[PATCH v10 00/20] index-helper/watchman

classic Classic list List threaded Threaded
33 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[PATCH v10 20/20] untracked-cache: config option

David Turner
Add a config option to populate the untracked cache.

For installations that have centrally-managed configuration, it's
easier to set a config once than to run update-index on every
repository.

Signed-off-by: David Turner <[hidden email]>
---
 Documentation/config.txt | 4 ++++
 read-cache.c             | 7 ++++++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 385ea66..c7b76ef 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1848,6 +1848,10 @@ imap::
  The configuration variables in the 'imap' section are described
  in linkgit:git-imap-send[1].
 
+index.adduntrackedcache::
+ Automatically populate the untracked cache whenever the index
+ is written.
+
 index.addwatchmanextension::
  Automatically add the watchman extension to the index whenever
  it is written.
diff --git a/read-cache.c b/read-cache.c
index 22c64d0..4a1cccf 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2472,7 +2472,7 @@ static int do_write_index(struct index_state *istate, int newfd,
  int entries = istate->cache_nr;
  struct stat st;
  struct strbuf previous_name_buf = STRBUF_INIT, *previous_name;
- int watchman = 0;
+ int watchman = 0, untracked = 0;
  uint64_t start = getnanotime();
 
  for (i = removed = extended = 0; i < entries; i++) {
@@ -2502,6 +2502,11 @@ static int do_write_index(struct index_state *istate, int newfd,
     !the_index.last_update)
  the_index.last_update = xstrdup("");
 
+ if (!git_config_get_bool("index.adduntrackedcache", &untracked) &&
+    untracked &&
+    !istate->untracked)
+ add_untracked_cache(&the_index);
+
  hdr_version = istate->version;
 
  hdr.hdr_signature = htonl(CACHE_SIGNATURE);
--
2.4.2.767.g62658d5-twtrsrc

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH v10 03/20] pkt-line: add gentle version of packet_write

Junio C Hamano
In reply to this post by David Turner
David Turner <[hidden email]> writes:

> packet_write calls write_or_die, which dies with a sigpipe even if
> calling code has explicitly blocked that signal.
>
> Add packet_write_gently and packet_flush_gently, which don't.  Soon,
> we will use this for communication with git index-helper, which, being
> merely an optimization, should be permitted to die without disrupting
> clients.
>
> Signed-off-by: David Turner <[hidden email]>
> ---

Looks quite sensible.  Thanks.

>  pkt-line.c | 18 ++++++++++++++++++
>  pkt-line.h |  2 ++
>  2 files changed, 20 insertions(+)
>
> diff --git a/pkt-line.c b/pkt-line.c
> index 62fdb37..f964446 100644
> --- a/pkt-line.c
> +++ b/pkt-line.c
> @@ -91,6 +91,12 @@ void packet_flush(int fd)
>   write_or_die(fd, "0000", 4);
>  }
>  
> +int packet_flush_gently(int fd)
> +{
> + packet_trace("0000", 4, 1);
> + return write_in_full(fd, "0000", 4) != 4;
> +}
> +
>  void packet_buf_flush(struct strbuf *buf)
>  {
>   packet_trace("0000", 4, 1);
> @@ -130,6 +136,18 @@ void packet_write(int fd, const char *fmt, ...)
>   write_or_die(fd, buf.buf, buf.len);
>  }
>  
> +int packet_write_gently(int fd, const char *fmt, ...)
> +{
> + static struct strbuf buf = STRBUF_INIT;
> + va_list args;
> +
> + strbuf_reset(&buf);
> + va_start(args, fmt);
> + format_packet(&buf, fmt, args);
> + va_end(args);
> + return write_in_full(fd, buf.buf, buf.len) != buf.len;
> +}
> +
>  void packet_buf_write(struct strbuf *buf, const char *fmt, ...)
>  {
>   va_list args;
> diff --git a/pkt-line.h b/pkt-line.h
> index 3cb9d91..deffcb5 100644
> --- a/pkt-line.h
> +++ b/pkt-line.h
> @@ -20,7 +20,9 @@
>   * side can't, we stay with pure read/write interfaces.
>   */
>  void packet_flush(int fd);
> +int packet_flush_gently(int fd);
>  void packet_write(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
> +int packet_write_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
>  void packet_buf_flush(struct strbuf *buf);
>  void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH v10 11/20] index-helper: use watchman to avoid refreshing index with lstat()

Ramsay Jones-2
In reply to this post by David Turner


On 12/05/16 21:20, David Turner wrote:
> From: Nguyễn Thái Ngọc Duy <[hidden email]>
[snip]

>  
> +/* in ms */
> +#define WATCHMAN_TIMEOUT 1000
> +
> +static int poke_and_wait_for_reply(int fd)
> +{
> + struct strbuf buf = STRBUF_INIT;
> + int ret = -1;
> + struct pollfd pollfd;
> + int bytes_read;
> + char reply_buf[4096];
> + const char *requested_capabilities = "";
> +
> +#ifdef USE_WATCHMAN
> + requested_capabilities = "watchman";
> +#endif
> +
> + if (fd < 0)
> + return -1;
> +
> + strbuf_addf(&buf, "poke %d %s", getpid(), requested_capabilities);
> + if (packet_write_gently(fd, buf.buf, buf.len))

This is not causing a problem or bug, but is none the less not
correct - as you know, packet_write_gently() takes a 'printf' like
variable argument list. (So, buf.buf is the format specifier and
buf.len is an unused arg).

I think I would write this as:

        strbuf_addf(&buf, "poke %d", getpid());
        if (requested_capabilities && *requested_capabilities)
                strbuf_addf(&buf, " %s", requested_capabilities);
        if (packet_write_gently(fd, "%s", buf.buf))

... or something similar. [Note, just typing into my email client, so
it's not been close to a compiler.]

> + return -1;
> + if (packet_flush_gently(fd))
> + return -1;

Why are you sending a flush packet - doesn't the index-helper
simply ignore it?

I haven't tried this yet BTW, just reading patches as they float
on past... ;-)

ATB,
Ramsay Jones

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH v10 11/20] index-helper: use watchman to avoid refreshing index with lstat()

David Turner
On Fri, 2016-05-13 at 00:10 +0100, Ramsay Jones wrote:

>
> On 12/05/16 21:20, David Turner wrote:
> > From: Nguyễn Thái Ngọc Duy <[hidden email]>
> [snip]
>
> >  
> > +/* in ms */
> > +#define WATCHMAN_TIMEOUT 1000
> > +
> > +static int poke_and_wait_for_reply(int fd)
> > +{
> > + struct strbuf buf = STRBUF_INIT;
> > + int ret = -1;
> > + struct pollfd pollfd;
> > + int bytes_read;
> > + char reply_buf[4096];
> > + const char *requested_capabilities = "";
> > +
> > +#ifdef USE_WATCHMAN
> > + requested_capabilities = "watchman";
> > +#endif
> > +
> > + if (fd < 0)
> > + return -1;
> > +
> > + strbuf_addf(&buf, "poke %d %s", getpid(),
> > requested_capabilities);
> > + if (packet_write_gently(fd, buf.buf, buf.len))
>
> This is not causing a problem or bug, but is none the less not
> correct - as you know, packet_write_gently() takes a 'printf' like
> variable argument list. (So, buf.buf is the format specifier and
> buf.len is an unused arg).
>
> I think I would write this as:
>
> strbuf_addf(&buf, "poke %d", getpid());
> if (requested_capabilities && *requested_capabilities)
> strbuf_addf(&buf, " %s", requested_capabilities);
> if (packet_write_gently(fd, "%s", buf.buf))
>
> ... or something similar. [Note, just typing into my email client, so
> it's not been close to a compiler.]

Thanks for the report. I'll fix it.

I'm going to just send the requested_capabilities regardless of whether
they are empty -- it won't hurt.  

> > + return -1;
> > + if (packet_flush_gently(fd))
> > + return -1;
>
> Why are you sending a flush packet - doesn't the index-helper
> simply ignore it?

It's not the packet that I'm excited about -- it's that later,
packet_write might be buffered (according to a docstring).  So I want
to ensure that the writes actually go out *now*.

> I haven't tried this yet BTW, just reading patches as they float
> on past... ;-)
>
> ATB,
> Ramsay Jones
>
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

[PATCH] index-helper: use watchman to avoid refreshing index with lstat()

David Turner
In reply to this post by Ramsay Jones-2
From: Nguyễn Thái Ngọc Duy <[hidden email]>

Watchman is hidden behind index-helper. Before git tries to read the
index from shm, it notifies index-helper through the socket and waits
for index-helper to prepare a file for sharing memory (with
MAP_SHARED). index-helper then contacts watchman, updates 'WAMA'
extension and put it in a separate file and wakes git up with a reply
to git's socket.

Git uses this extension to not lstat unchanged entries. Git only
trusts the 'WAMA' extension when it's received from the separate file,
not from disk. Unmarked entries are "clean". Marked entries are dirty
from watchman point of view. If it finds out some entries are
'watchman-dirty', but are really unchanged (e.g. the file was changed,
then reverted back), then Git will clear the marking in 'WAMA' before
writing it down.

Hiding watchman behind index-helper means you need both daemons. You
can't run watchman alone. Not so good. But on the other hand, 'git'
binary is not linked to watchman/json libraries, which is good for
packaging. Core git package will run fine without watchman-related
packages. If they need watchman, they can install git-index-helper and
dependencies.

This also lets us trust anything in the untracked cache that we haven't
marked invalid, saving those stat() calls.

Another reason for tying watchman to index-helper is, when used with
untracked cache, we need to keep track of $GIT_WORK_TREE file
listing. That kind of list can be kept in index-helper.

Signed-off-by: Nguyễn Thái Ngọc Duy <[hidden email]>
Signed-off-by: David Turner <[hidden email]>
---

Can we just replace this one patch?

 Documentation/git-index-helper.txt |   6 +
 cache.h                            |   2 +
 dir.c                              |  23 +++-
 dir.h                              |   3 +
 index-helper.c                     | 107 ++++++++++++++--
 read-cache.c                       | 241 ++++++++++++++++++++++++++++++++++---
 6 files changed, 355 insertions(+), 27 deletions(-)

diff --git a/Documentation/git-index-helper.txt b/Documentation/git-index-helper.txt
index e144752..cce00cb 100644
--- a/Documentation/git-index-helper.txt
+++ b/Documentation/git-index-helper.txt
@@ -55,6 +55,12 @@ command. The following commands are used to control the daemon:
  Let the daemon know the index is to be read. It keeps the
  daemon alive longer, unless `--exit-after=0` is used.
 
+"poke <path>":
+ Like "poke", but replies with "OK".  If the index has the
+ watchman extension, index-helper queries watchman, then
+ prepares a shared memory object with the watchman index
+ extension before replying.
+
 All commands and replies are terminated by a NUL byte.
 
 In the event of an error, messages may be written to
diff --git a/cache.h b/cache.h
index 452aea2..633e1dd 100644
--- a/cache.h
+++ b/cache.h
@@ -567,6 +567,7 @@ extern int daemonize(int *);
 
 /* Initialize and use the cache information */
 struct lock_file;
+extern int verify_index(const struct index_state *);
 extern int read_index(struct index_state *);
 extern int read_index_preload(struct index_state *, const struct pathspec *pathspec);
 extern int do_read_index(struct index_state *istate, const char *path,
@@ -574,6 +575,7 @@ extern int do_read_index(struct index_state *istate, const char *path,
 extern int read_index_from(struct index_state *, const char *path);
 extern int is_index_unborn(struct index_state *);
 extern int read_index_unmerged(struct index_state *);
+extern void write_watchman_ext(struct strbuf *sb, struct index_state *istate);
 #define COMMIT_LOCK (1 << 0)
 #define CLOSE_LOCK (1 << 1)
 extern int write_locked_index(struct index_state *, struct lock_file *lock, unsigned flags);
diff --git a/dir.c b/dir.c
index 69e0be6..5058b29 100644
--- a/dir.c
+++ b/dir.c
@@ -597,9 +597,9 @@ static void trim_trailing_spaces(char *buf)
  *
  * If "name" has the trailing slash, it'll be excluded in the search.
  */
-static struct untracked_cache_dir *lookup_untracked(struct untracked_cache *uc,
-    struct untracked_cache_dir *dir,
-    const char *name, int len)
+struct untracked_cache_dir *lookup_untracked(struct untracked_cache *uc,
+     struct untracked_cache_dir *dir,
+     const char *name, int len)
 {
  int first, last;
  struct untracked_cache_dir *d;
@@ -1726,6 +1726,17 @@ static int valid_cached_dir(struct dir_struct *dir,
  if (!untracked)
  return 0;
 
+ if (dir->untracked->use_watchman) {
+ /*
+ * With watchman, we can trust the untracked cache's
+ * valid field.
+ */
+ if (untracked->valid)
+ goto skip_stat;
+ else
+ invalidate_directory(dir->untracked, untracked);
+ }
+
  if (stat(path->len ? path->buf : ".", &st)) {
  invalidate_directory(dir->untracked, untracked);
  memset(&untracked->stat_data, 0, sizeof(untracked->stat_data));
@@ -1739,6 +1750,7 @@ static int valid_cached_dir(struct dir_struct *dir,
  return 0;
  }
 
+skip_stat:
  if (untracked->check_only != !!check_only) {
  invalidate_directory(dir->untracked, untracked);
  return 0;
@@ -2625,8 +2637,10 @@ static void free_untracked(struct untracked_cache_dir *ucd)
 
 void free_untracked_cache(struct untracked_cache *uc)
 {
- if (uc)
+ if (uc) {
  free_untracked(uc->root);
+ string_list_clear(&uc->invalid_untracked, 0);
+ }
  free(uc);
 }
 
@@ -2775,6 +2789,7 @@ struct untracked_cache *read_untracked_extension(const void *data, unsigned long
  return NULL;
 
  uc = xcalloc(1, sizeof(*uc));
+ string_list_init(&uc->invalid_untracked, 1);
  strbuf_init(&uc->ident, ident_len);
  strbuf_add(&uc->ident, ident, ident_len);
  load_sha1_stat(&uc->ss_info_exclude, &ouc->info_exclude_stat,
diff --git a/dir.h b/dir.h
index 3d540de..8fd3f9e 100644
--- a/dir.h
+++ b/dir.h
@@ -315,4 +315,7 @@ struct untracked_cache *read_untracked_extension(const void *data, unsigned long
 void write_untracked_extension(struct strbuf *out, struct untracked_cache *untracked);
 void add_untracked_cache(struct index_state *istate);
 void remove_untracked_cache(struct index_state *istate);
+struct untracked_cache_dir *lookup_untracked(struct untracked_cache *uc,
+     struct untracked_cache_dir *dir,
+     const char *name, int len);
 #endif
diff --git a/index-helper.c b/index-helper.c
index a3a9b00..6026c74 100644
--- a/index-helper.c
+++ b/index-helper.c
@@ -8,15 +8,18 @@
 #include "cache.h"
 #include "unix-socket.h"
 #include "pkt-line.h"
+#include "watchman-support.h"
 
 struct shm {
  unsigned char sha1[20];
  void *shm;
  size_t size;
+ pid_t pid;
 };
 
 static struct shm shm_index;
 static struct shm shm_base_index;
+static struct shm shm_watchman;
 static int daemonized, to_verify = 1;
 
 static void release_index_shm(struct shm *is)
@@ -28,10 +31,21 @@ static void release_index_shm(struct shm *is)
  is->shm = NULL;
 }
 
+static void release_watchman_shm(struct shm *is)
+{
+ if (!is->shm)
+ return;
+ munmap(is->shm, is->size);
+ unlink(git_path("shm-watchman-%s-%" PRIuMAX,
+ sha1_to_hex(is->sha1), (uintmax_t)is->pid));
+ is->shm = NULL;
+}
+
 static void cleanup_shm(void)
 {
  release_index_shm(&shm_index);
  release_index_shm(&shm_base_index);
+ release_watchman_shm(&shm_watchman);
 }
 
 static void cleanup(void)
@@ -166,9 +180,10 @@ static void share_the_index(void)
  if (the_index.split_index && the_index.split_index->base)
  share_index(the_index.split_index->base, &shm_base_index);
  share_index(&the_index, &shm_index);
- if (to_verify && !verify_shm())
+ if (to_verify && !verify_shm()) {
  cleanup_shm();
- discard_index(&the_index);
+ discard_index(&the_index);
+ }
 }
 
 static void set_socket_blocking_flag(int fd, int make_nonblocking)
@@ -201,6 +216,80 @@ static void refresh(void)
 
 #ifndef NO_MMAP
 
+#ifdef USE_WATCHMAN
+static void share_watchman(struct index_state *istate,
+   struct shm *is, pid_t pid)
+{
+ struct strbuf sb = STRBUF_INIT;
+ void *shm;
+
+ write_watchman_ext(&sb, istate);
+ if (!shared_mmap_create(sb.len + 20, &shm,
+ git_path("shm-watchman-%s-%" PRIuMAX,
+ sha1_to_hex(istate->sha1),
+ (uintmax_t)pid))) {
+ is->size = sb.len + 20;
+ is->shm = shm;
+ is->pid = pid;
+ hashcpy(is->sha1, istate->sha1);
+
+ memcpy(shm, sb.buf, sb.len);
+ hashcpy((unsigned char *)shm + is->size - 20, is->sha1);
+ }
+ strbuf_release(&sb);
+}
+
+
+static void prepare_with_watchman(pid_t pid)
+{
+ /*
+ * TODO: with the help of watchman, maybe we could detect if
+ * $GIT_DIR/index is updated.
+ */
+ if (!verify_index(&the_index))
+ refresh();
+
+ if (check_watchman(&the_index))
+ return;
+
+ share_watchman(&the_index, &shm_watchman, pid);
+}
+
+static void prepare_index(pid_t pid)
+{
+ if (shm_index.shm == NULL)
+ refresh();
+ release_watchman_shm(&shm_watchman);
+ if (the_index.last_update)
+ prepare_with_watchman(pid);
+}
+
+#endif
+
+static void reply_to_poke(int client_fd, const char *pid_buf)
+{
+ char *capabilities;
+ struct strbuf sb = STRBUF_INIT;
+
+#ifdef USE_WATCHMAN
+ pid_t client_pid = strtoull(pid_buf, NULL, 10);
+
+ prepare_index(client_pid);
+#endif
+ capabilities = strchr(pid_buf, ' ');
+
+ if (!strcmp(capabilities, " watchman"))
+#ifdef USE_WATCHMAN
+ packet_buf_write(&sb, "OK watchman");
+#else
+ packet_buf_write(&sb, "NAK watchman");
+#endif
+ else
+ packet_buf_write(&sb, "OK");
+ if (write_in_full(client_fd, sb.buf, sb.len) != sb.len)
+ warning(_("client write failed"));
+}
+
 static void loop(int fd, int idle_in_seconds)
 {
  assert(idle_in_seconds < INT_MAX / 1000);
@@ -252,11 +341,15 @@ static void loop(int fd, int idle_in_seconds)
  buf[bytes_read] = 0;
  if (!strcmp(buf, "refresh")) {
  refresh();
- } else if (!strcmp(buf, "poke")) {
- /*
- * Just a poke to keep us
- * alive, nothing to do.
- */
+ } else if (starts_with(buf, "poke")) {
+ if (buf[4] == ' ') {
+ reply_to_poke(client_fd, buf + 5);
+ } else {
+ /*
+ * Just a poke to keep us
+ * alive, nothing to do.
+ */
+ }
  } else {
  warning("BUG: Bogus command %s", buf);
  }
diff --git a/read-cache.c b/read-cache.c
index 1719f5a..8ec4be3 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1235,7 +1235,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
  if (!new) {
  const char *fmt;
 
- if (really && cache_errno == EINVAL) {
+ if (really || cache_errno == EINVAL) {
  /* If we are doing --really-refresh that
  * means the index is not valid anymore.
  */
@@ -1375,11 +1375,75 @@ static int verify_hdr(const struct cache_header *hdr, unsigned long size)
  return 0;
 }
 
+static struct untracked_cache_dir *find_untracked_cache_dir(
+ struct untracked_cache *uc, struct untracked_cache_dir *ucd,
+ const char *name)
+{
+ int component_len;
+ const char *end;
+ struct untracked_cache_dir *dir;
+
+ if (!*name)
+ return ucd;
+
+ end = strchr(name, '/');
+ if (end)
+ component_len = end - name;
+ else
+ component_len = strlen(name);
+
+ dir = lookup_untracked(uc, ucd, name, component_len);
+ if (dir)
+ return find_untracked_cache_dir(uc, dir, name + component_len + 1);
+
+ return NULL;
+}
+
 static void mark_no_watchman(size_t pos, void *data)
 {
  struct index_state *istate = data;
+ struct cache_entry *ce = istate->cache[pos];
+ struct strbuf sb = STRBUF_INIT;
+ char *c;
+ struct untracked_cache_dir *dir;
+
  assert(pos < istate->cache_nr);
- istate->cache[pos]->ce_flags |= CE_WATCHMAN_DIRTY;
+ ce->ce_flags |= CE_WATCHMAN_DIRTY;
+
+ if (!istate->untracked || !istate->untracked->root)
+ return;
+
+ strbuf_add(&sb, ce->name, ce_namelen(ce));
+
+ for (c = sb.buf + sb.len - 1; c > sb.buf; c--) {
+ if (*c == '/') {
+ strbuf_setlen(&sb, c - sb.buf);
+ break;
+ }
+ }
+
+ if (c == sb.buf)
+ strbuf_setlen(&sb, 0);
+
+ dir = find_untracked_cache_dir(istate->untracked,
+       istate->untracked->root, sb.buf);
+ if (dir)
+ dir->valid = 0;
+
+ strbuf_release(&sb);
+}
+
+static int mark_untracked_invalid(struct string_list_item *item, void *uc)
+{
+ struct untracked_cache *untracked = uc;
+ struct untracked_cache_dir *dir;
+
+ dir = find_untracked_cache_dir(untracked, untracked->root,
+       item->string);
+ if (dir)
+ dir->valid = 0;
+
+ return 0;
 }
 
 static int read_watchman_ext(struct index_state *istate, const void *data,
@@ -1409,10 +1473,24 @@ static int read_watchman_ext(struct index_state *istate, const void *data,
  ewah_each_bit(bitmap, mark_no_watchman, istate);
  ewah_free(bitmap);
 
- /*
- * TODO: update the untracked cache from the untracked data in this
- * extension.
- */
+ if (istate->untracked && istate->untracked->root) {
+ int i;
+ const char *untracked;
+
+ untracked = (const char *)data + len + 8 + bitmap_size;
+ for (i = 0; i < untracked_nr; ++i) {
+ int len = strlen(untracked);
+ string_list_append(&istate->untracked->invalid_untracked,
+   untracked);
+ untracked += len + 1;
+ }
+
+ for_each_string_list(&istate->untracked->invalid_untracked,
+ mark_untracked_invalid, istate->untracked);
+
+ if (untracked_nr)
+ istate->cache_changed |= WATCHMAN_CHANGED;
+ }
  return 0;
 }
 
@@ -1645,29 +1723,88 @@ static void post_read_index_from(struct index_state *istate)
  tweak_untracked_cache(istate);
 }
 
+/* in ms */
+#define WATCHMAN_TIMEOUT 1000
+
+static int poke_and_wait_for_reply(int fd)
+{
+ int ret = -1;
+ struct pollfd pollfd;
+ int bytes_read;
+ char reply_buf[4096];
+ const char *requested_capabilities = "";
+
+#ifdef USE_WATCHMAN
+ requested_capabilities = "watchman";
+#endif
+
+ if (fd < 0)
+ return -1;
+
+ if (packet_write_gently(fd, "poke %d %s", getpid(), requested_capabilities))
+ return -1;
+ if (packet_flush_gently(fd))
+ return -1;
+
+ /* Now wait for a reply */
+ pollfd.fd = fd;
+ pollfd.events = POLLIN;
+ if (poll(&pollfd, 1, WATCHMAN_TIMEOUT) <= 0)
+ /* No reply or error, giving up */
+ goto done_poke;
+
+ bytes_read = packet_read(fd, NULL, NULL, reply_buf, sizeof(reply_buf),
+ PACKET_READ_GENTLE_ON_EOF |
+ PACKET_READ_CHOMP_NEWLINE);
+
+ if (bytes_read < 0)
+ goto done_poke;
+
+ if (!strcmp(reply_buf, "NAK watchman")
+#ifdef USE_WATCHMAN
+    || !ends_with(reply_buf, "watchman")
+#endif
+ ) {
+ warning("We requested watchman support from index-helper, but "
+ "it doesn't support it. Please use a version of git "
+ "index-helper with watchman support.");
+ goto done_poke;
+ }
+
+ if (!starts_with(reply_buf, "OK"))
+ goto done_poke;
+
+ ret = 0;
+done_poke:
+ close(fd);
+ return ret;
+}
+
 static int poke_daemon(struct index_state *istate,
        const struct stat *st, int refresh_cache)
 {
  int fd;
- int ret = 0;
- const char *socket_path;
+ int ret = -1;
 
  /* if this is from index-helper, do not poke itself (recursively) */
  if (istate->to_shm)
  return 0;
 
- socket_path = git_path("index-helper.sock");
- if (!socket_path)
+ fd = unix_stream_connect(git_path("index-helper.sock"));
+ if (fd < 0) {
+ warning("Failed to connect to index-helper socket");
+ unlink(git_path("index-helper.sock"));
  return -1;
-
- fd = unix_stream_connect(socket_path);
+ }
  sigchain_push(SIGPIPE, SIG_IGN);
+
  if (refresh_cache) {
  packet_write_gently(fd, "refresh");
+ packet_flush_gently(fd);
+ ret = 0;
  } else {
- packet_write_gently(fd, "poke");
+ ret = poke_and_wait_for_reply(fd);
  }
- packet_flush_gently(fd);
 
  close(fd);
  sigchain_pop(SIGPIPE);
@@ -1737,6 +1874,74 @@ fail:
  return -1;
 }
 
+static void refresh_by_watchman(struct index_state *istate)
+{
+ void *shm = NULL;
+ int length;
+ int i;
+ struct stat st;
+ int fd = -1;
+ const char *path = git_path("shm-watchman-%s-%"PRIuMAX,
+    sha1_to_hex(istate->sha1),
+    (uintmax_t)getpid());
+
+ fd = open(path, O_RDONLY);
+ if (fd < 0)
+ return;
+
+ /*
+ * This watchman data is just for us -- no need to keep it
+ * around once we've got it open.
+ */
+ unlink(path);
+
+ if (fstat(fd, &st) < 0)
+ goto done;
+
+ length = st.st_size;
+ shm = mmap(NULL, length, PROT_READ, MAP_SHARED, fd, 0);
+
+ if (shm == MAP_FAILED)
+ goto done;
+
+ close(fd);
+ fd = -1;
+
+ if (length <= 20 ||
+    hashcmp(istate->sha1, (unsigned char *)shm + length - 20) ||
+    /*
+     * No need to clear CE_WATCHMAN_DIRTY set by 'WAMA' on
+     * disk. Watchman can only set more, not clear any, so
+     * this is OR mask.
+     */
+    read_watchman_ext(istate, shm, length - 20))
+ goto done;
+
+ /*
+ * Now that we've marked the invalid entries in the
+ * untracked-cache itself, we can erase them from the list of
+ * entries to be processed and mark the untracked cache for
+ * watchman usage.
+ */
+ if (istate->untracked) {
+ string_list_clear(&istate->untracked->invalid_untracked, 0);
+ istate->untracked->use_watchman = 1;
+ }
+
+ for (i = 0; i < istate->cache_nr; i++) {
+ struct cache_entry *ce = istate->cache[i];
+ if (ce_stage(ce) || (ce->ce_flags & CE_WATCHMAN_DIRTY))
+ continue;
+ ce_mark_uptodate(ce);
+ }
+done:
+ if (shm)
+ munmap(shm, length);
+
+ if (fd >= 0)
+ close(fd);
+}
+
 /* remember to discard_cache() before reading a different cache! */
 int do_read_index(struct index_state *istate, const char *path, int must_exist)
 {
@@ -1856,7 +2061,7 @@ int read_index_from(struct index_state *istate, const char *path)
  split_index = istate->split_index;
  if (!split_index || is_null_sha1(split_index->base_sha1)) {
  post_read_index_from(istate);
- return ret;
+ goto done;
  }
 
  if (split_index->base)
@@ -1877,6 +2082,10 @@ int read_index_from(struct index_state *istate, const char *path)
     sha1_to_hex(split_index->base->sha1));
  merge_base_index(istate);
  post_read_index_from(istate);
+
+done:
+ if (ret > 0 && istate->from_shm && istate->last_update)
+ refresh_by_watchman(istate);
  return ret;
 }
 
@@ -2178,7 +2387,7 @@ out:
  return 0;
 }
 
-static int verify_index(const struct index_state *istate)
+int verify_index(const struct index_state *istate)
 {
  return verify_index_from(istate, get_index_file());
 }
--
2.4.2.767.g62658d5-twtrsrc

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] index-helper: use watchman to avoid refreshing index with lstat()

Ramsay Jones-2


On 13/05/16 19:27, David Turner wrote:
> From: Nguyễn Thái Ngọc Duy <[hidden email]>
[snip]

> +
> +static void reply_to_poke(int client_fd, const char *pid_buf)
> +{
> + char *capabilities;
> + struct strbuf sb = STRBUF_INIT;
> +
> +#ifdef USE_WATCHMAN
> + pid_t client_pid = strtoull(pid_buf, NULL, 10);
> +
> + prepare_index(client_pid);
> +#endif
> + capabilities = strchr(pid_buf, ' ');

So, if the pid is *not* followed by a space, the capabilities
will be NULL here, and ...

> +
> + if (!strcmp(capabilities, " watchman"))

... we segfault here.

> +#ifdef USE_WATCHMAN
> + packet_buf_write(&sb, "OK watchman");
> +#else
> + packet_buf_write(&sb, "NAK watchman");
> +#endif
> + else
> + packet_buf_write(&sb, "OK");
> + if (write_in_full(client_fd, sb.buf, sb.len) != sb.len)
> + warning(_("client write failed"));
> +}
> +
>  static void loop(int fd, int idle_in_seconds)
>  {
>   assert(idle_in_seconds < INT_MAX / 1000);
> @@ -252,11 +341,15 @@ static void loop(int fd, int idle_in_seconds)
>   buf[bytes_read] = 0;
>   if (!strcmp(buf, "refresh")) {
>   refresh();
> - } else if (!strcmp(buf, "poke")) {
> - /*
> - * Just a poke to keep us
> - * alive, nothing to do.
> - */
> + } else if (starts_with(buf, "poke")) {
> + if (buf[4] == ' ') {
> + reply_to_poke(client_fd, buf + 5);
> + } else {
> + /*
> + * Just a poke to keep us
> + * alive, nothing to do.
> + */
> + }
>   } else {
>   warning("BUG: Bogus command %s", buf);
>   }
> diff --git a/read-cache.c b/read-cache.c
> index 1719f5a..8ec4be3 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1235,7 +1235,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
>   if (!new) {
>   const char *fmt;
>  
> - if (really && cache_errno == EINVAL) {
> + if (really || cache_errno == EINVAL) {
>   /* If we are doing --really-refresh that
>   * means the index is not valid anymore.
>   */
> @@ -1375,11 +1375,75 @@ static int verify_hdr(const struct cache_header *hdr, unsigned long size)
>   return 0;
>  }
>  
> +static struct untracked_cache_dir *find_untracked_cache_dir(
> + struct untracked_cache *uc, struct untracked_cache_dir *ucd,
> + const char *name)
> +{
> + int component_len;
> + const char *end;
> + struct untracked_cache_dir *dir;
> +
> + if (!*name)
> + return ucd;
> +
> + end = strchr(name, '/');
> + if (end)
> + component_len = end - name;
> + else
> + component_len = strlen(name);
> +
> + dir = lookup_untracked(uc, ucd, name, component_len);
> + if (dir)
> + return find_untracked_cache_dir(uc, dir, name + component_len + 1);
> +
> + return NULL;
> +}
> +
>  static void mark_no_watchman(size_t pos, void *data)
>  {
>   struct index_state *istate = data;
> + struct cache_entry *ce = istate->cache[pos];
> + struct strbuf sb = STRBUF_INIT;
> + char *c;
> + struct untracked_cache_dir *dir;
> +
>   assert(pos < istate->cache_nr);
> - istate->cache[pos]->ce_flags |= CE_WATCHMAN_DIRTY;
> + ce->ce_flags |= CE_WATCHMAN_DIRTY;
> +
> + if (!istate->untracked || !istate->untracked->root)
> + return;
> +
> + strbuf_add(&sb, ce->name, ce_namelen(ce));
> +
> + for (c = sb.buf + sb.len - 1; c > sb.buf; c--) {
> + if (*c == '/') {
> + strbuf_setlen(&sb, c - sb.buf);
> + break;
> + }
> + }
> +
> + if (c == sb.buf)
> + strbuf_setlen(&sb, 0);
> +
> + dir = find_untracked_cache_dir(istate->untracked,
> +       istate->untracked->root, sb.buf);
> + if (dir)
> + dir->valid = 0;
> +
> + strbuf_release(&sb);
> +}
> +
> +static int mark_untracked_invalid(struct string_list_item *item, void *uc)
> +{
> + struct untracked_cache *untracked = uc;
> + struct untracked_cache_dir *dir;
> +
> + dir = find_untracked_cache_dir(untracked, untracked->root,
> +       item->string);
> + if (dir)
> + dir->valid = 0;
> +
> + return 0;
>  }
>  
>  static int read_watchman_ext(struct index_state *istate, const void *data,
> @@ -1409,10 +1473,24 @@ static int read_watchman_ext(struct index_state *istate, const void *data,
>   ewah_each_bit(bitmap, mark_no_watchman, istate);
>   ewah_free(bitmap);
>  
> - /*
> - * TODO: update the untracked cache from the untracked data in this
> - * extension.
> - */
> + if (istate->untracked && istate->untracked->root) {
> + int i;
> + const char *untracked;
> +
> + untracked = (const char *)data + len + 8 + bitmap_size;
> + for (i = 0; i < untracked_nr; ++i) {
> + int len = strlen(untracked);
> + string_list_append(&istate->untracked->invalid_untracked,
> +   untracked);
> + untracked += len + 1;
> + }
> +
> + for_each_string_list(&istate->untracked->invalid_untracked,
> + mark_untracked_invalid, istate->untracked);
> +
> + if (untracked_nr)
> + istate->cache_changed |= WATCHMAN_CHANGED;
> + }
>   return 0;
>  }
>  
> @@ -1645,29 +1723,88 @@ static void post_read_index_from(struct index_state *istate)
>   tweak_untracked_cache(istate);
>  }
>  
> +/* in ms */
> +#define WATCHMAN_TIMEOUT 1000
> +
> +static int poke_and_wait_for_reply(int fd)
> +{
> + int ret = -1;
> + struct pollfd pollfd;
> + int bytes_read;
> + char reply_buf[4096];
> + const char *requested_capabilities = "";
> +
> +#ifdef USE_WATCHMAN
> + requested_capabilities = "watchman";
> +#endif
> +
> + if (fd < 0)
> + return -1;
> +
> + if (packet_write_gently(fd, "poke %d %s", getpid(), requested_capabilities))

So, adding the empty capabilities (and more importantly the
separating space) is not so much 'doesn't hurt', rather than
'prevents a core-dump!' ;-)

> + return -1;
> + if (packet_flush_gently(fd))
> + return -1;

And yes, I'd forgotten about the 'maybe sometime in the future, we
could buffer the packets' ...

ATB,
Ramsay Jones

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

[PATCH v10 ter] index-helper: use watchman to avoid refreshing index with lstat()

David Turner
From: Nguyễn Thái Ngọc Duy <[hidden email]>

Watchman is hidden behind index-helper. Before git tries to read the
index from shm, it notifies index-helper through the socket and waits
for index-helper to prepare a file for sharing memory (with
MAP_SHARED). index-helper then contacts watchman, updates 'WAMA'
extension and put it in a separate file and wakes git up with a reply
to git's socket.

Git uses this extension to not lstat unchanged entries. Git only
trusts the 'WAMA' extension when it's received from the separate file,
not from disk. Unmarked entries are "clean". Marked entries are dirty
from watchman point of view. If it finds out some entries are
'watchman-dirty', but are really unchanged (e.g. the file was changed,
then reverted back), then Git will clear the marking in 'WAMA' before
writing it down.

Hiding watchman behind index-helper means you need both daemons. You
can't run watchman alone. Not so good. But on the other hand, 'git'
binary is not linked to watchman/json libraries, which is good for
packaging. Core git package will run fine without watchman-related
packages. If they need watchman, they can install git-index-helper and
dependencies.

This also lets us trust anything in the untracked cache that we haven't
marked invalid, saving those stat() calls.

Another reason for tying watchman to index-helper is, when used with
untracked cache, we need to keep track of $GIT_WORK_TREE file
listing. That kind of list can be kept in index-helper.

Helped-by: Ramsay Jones <[hidden email]>
Signed-off-by: Nguyễn Thái Ngọc Duy <[hidden email]>
Signed-off-by: David Turner <[hidden email]>
---

Right, let's try this one instead.

The actual segfault that Ramsay points out will not happen in ordinary
usage (since we control both client and server), but it's better to be
clean.

---
 Documentation/git-index-helper.txt |   6 +
 cache.h                            |   2 +
 dir.c                              |  23 +++-
 dir.h                              |   3 +
 index-helper.c                     | 107 ++++++++++++++--
 read-cache.c                       | 241 ++++++++++++++++++++++++++++++++++---
 6 files changed, 355 insertions(+), 27 deletions(-)

diff --git a/Documentation/git-index-helper.txt b/Documentation/git-index-helper.txt
index e144752..cce00cb 100644
--- a/Documentation/git-index-helper.txt
+++ b/Documentation/git-index-helper.txt
@@ -55,6 +55,12 @@ command. The following commands are used to control the daemon:
  Let the daemon know the index is to be read. It keeps the
  daemon alive longer, unless `--exit-after=0` is used.
 
+"poke <path>":
+ Like "poke", but replies with "OK".  If the index has the
+ watchman extension, index-helper queries watchman, then
+ prepares a shared memory object with the watchman index
+ extension before replying.
+
 All commands and replies are terminated by a NUL byte.
 
 In the event of an error, messages may be written to
diff --git a/cache.h b/cache.h
index 452aea2..633e1dd 100644
--- a/cache.h
+++ b/cache.h
@@ -567,6 +567,7 @@ extern int daemonize(int *);
 
 /* Initialize and use the cache information */
 struct lock_file;
+extern int verify_index(const struct index_state *);
 extern int read_index(struct index_state *);
 extern int read_index_preload(struct index_state *, const struct pathspec *pathspec);
 extern int do_read_index(struct index_state *istate, const char *path,
@@ -574,6 +575,7 @@ extern int do_read_index(struct index_state *istate, const char *path,
 extern int read_index_from(struct index_state *, const char *path);
 extern int is_index_unborn(struct index_state *);
 extern int read_index_unmerged(struct index_state *);
+extern void write_watchman_ext(struct strbuf *sb, struct index_state *istate);
 #define COMMIT_LOCK (1 << 0)
 #define CLOSE_LOCK (1 << 1)
 extern int write_locked_index(struct index_state *, struct lock_file *lock, unsigned flags);
diff --git a/dir.c b/dir.c
index 69e0be6..5058b29 100644
--- a/dir.c
+++ b/dir.c
@@ -597,9 +597,9 @@ static void trim_trailing_spaces(char *buf)
  *
  * If "name" has the trailing slash, it'll be excluded in the search.
  */
-static struct untracked_cache_dir *lookup_untracked(struct untracked_cache *uc,
-    struct untracked_cache_dir *dir,
-    const char *name, int len)
+struct untracked_cache_dir *lookup_untracked(struct untracked_cache *uc,
+     struct untracked_cache_dir *dir,
+     const char *name, int len)
 {
  int first, last;
  struct untracked_cache_dir *d;
@@ -1726,6 +1726,17 @@ static int valid_cached_dir(struct dir_struct *dir,
  if (!untracked)
  return 0;
 
+ if (dir->untracked->use_watchman) {
+ /*
+ * With watchman, we can trust the untracked cache's
+ * valid field.
+ */
+ if (untracked->valid)
+ goto skip_stat;
+ else
+ invalidate_directory(dir->untracked, untracked);
+ }
+
  if (stat(path->len ? path->buf : ".", &st)) {
  invalidate_directory(dir->untracked, untracked);
  memset(&untracked->stat_data, 0, sizeof(untracked->stat_data));
@@ -1739,6 +1750,7 @@ static int valid_cached_dir(struct dir_struct *dir,
  return 0;
  }
 
+skip_stat:
  if (untracked->check_only != !!check_only) {
  invalidate_directory(dir->untracked, untracked);
  return 0;
@@ -2625,8 +2637,10 @@ static void free_untracked(struct untracked_cache_dir *ucd)
 
 void free_untracked_cache(struct untracked_cache *uc)
 {
- if (uc)
+ if (uc) {
  free_untracked(uc->root);
+ string_list_clear(&uc->invalid_untracked, 0);
+ }
  free(uc);
 }
 
@@ -2775,6 +2789,7 @@ struct untracked_cache *read_untracked_extension(const void *data, unsigned long
  return NULL;
 
  uc = xcalloc(1, sizeof(*uc));
+ string_list_init(&uc->invalid_untracked, 1);
  strbuf_init(&uc->ident, ident_len);
  strbuf_add(&uc->ident, ident, ident_len);
  load_sha1_stat(&uc->ss_info_exclude, &ouc->info_exclude_stat,
diff --git a/dir.h b/dir.h
index 3d540de..8fd3f9e 100644
--- a/dir.h
+++ b/dir.h
@@ -315,4 +315,7 @@ struct untracked_cache *read_untracked_extension(const void *data, unsigned long
 void write_untracked_extension(struct strbuf *out, struct untracked_cache *untracked);
 void add_untracked_cache(struct index_state *istate);
 void remove_untracked_cache(struct index_state *istate);
+struct untracked_cache_dir *lookup_untracked(struct untracked_cache *uc,
+     struct untracked_cache_dir *dir,
+     const char *name, int len);
 #endif
diff --git a/index-helper.c b/index-helper.c
index a3a9b00..b9a050a 100644
--- a/index-helper.c
+++ b/index-helper.c
@@ -8,15 +8,18 @@
 #include "cache.h"
 #include "unix-socket.h"
 #include "pkt-line.h"
+#include "watchman-support.h"
 
 struct shm {
  unsigned char sha1[20];
  void *shm;
  size_t size;
+ pid_t pid;
 };
 
 static struct shm shm_index;
 static struct shm shm_base_index;
+static struct shm shm_watchman;
 static int daemonized, to_verify = 1;
 
 static void release_index_shm(struct shm *is)
@@ -28,10 +31,21 @@ static void release_index_shm(struct shm *is)
  is->shm = NULL;
 }
 
+static void release_watchman_shm(struct shm *is)
+{
+ if (!is->shm)
+ return;
+ munmap(is->shm, is->size);
+ unlink(git_path("shm-watchman-%s-%" PRIuMAX,
+ sha1_to_hex(is->sha1), (uintmax_t)is->pid));
+ is->shm = NULL;
+}
+
 static void cleanup_shm(void)
 {
  release_index_shm(&shm_index);
  release_index_shm(&shm_base_index);
+ release_watchman_shm(&shm_watchman);
 }
 
 static void cleanup(void)
@@ -166,9 +180,10 @@ static void share_the_index(void)
  if (the_index.split_index && the_index.split_index->base)
  share_index(the_index.split_index->base, &shm_base_index);
  share_index(&the_index, &shm_index);
- if (to_verify && !verify_shm())
+ if (to_verify && !verify_shm()) {
  cleanup_shm();
- discard_index(&the_index);
+ discard_index(&the_index);
+ }
 }
 
 static void set_socket_blocking_flag(int fd, int make_nonblocking)
@@ -201,6 +216,80 @@ static void refresh(void)
 
 #ifndef NO_MMAP
 
+#ifdef USE_WATCHMAN
+static void share_watchman(struct index_state *istate,
+   struct shm *is, pid_t pid)
+{
+ struct strbuf sb = STRBUF_INIT;
+ void *shm;
+
+ write_watchman_ext(&sb, istate);
+ if (!shared_mmap_create(sb.len + 20, &shm,
+ git_path("shm-watchman-%s-%" PRIuMAX,
+ sha1_to_hex(istate->sha1),
+ (uintmax_t)pid))) {
+ is->size = sb.len + 20;
+ is->shm = shm;
+ is->pid = pid;
+ hashcpy(is->sha1, istate->sha1);
+
+ memcpy(shm, sb.buf, sb.len);
+ hashcpy((unsigned char *)shm + is->size - 20, is->sha1);
+ }
+ strbuf_release(&sb);
+}
+
+
+static void prepare_with_watchman(pid_t pid)
+{
+ /*
+ * TODO: with the help of watchman, maybe we could detect if
+ * $GIT_DIR/index is updated.
+ */
+ if (!verify_index(&the_index))
+ refresh();
+
+ if (check_watchman(&the_index))
+ return;
+
+ share_watchman(&the_index, &shm_watchman, pid);
+}
+
+static void prepare_index(pid_t pid)
+{
+ if (shm_index.shm == NULL)
+ refresh();
+ release_watchman_shm(&shm_watchman);
+ if (the_index.last_update)
+ prepare_with_watchman(pid);
+}
+
+#endif
+
+static void reply_to_poke(int client_fd, const char *pid_buf)
+{
+ char *capabilities;
+ struct strbuf sb = STRBUF_INIT;
+
+#ifdef USE_WATCHMAN
+ pid_t client_pid = strtoull(pid_buf, NULL, 10);
+
+ prepare_index(client_pid);
+#endif
+ capabilities = strchr(pid_buf, ' ');
+
+ if (capabilities && !strcmp(capabilities, " watchman"))
+#ifdef USE_WATCHMAN
+ packet_buf_write(&sb, "OK watchman");
+#else
+ packet_buf_write(&sb, "NAK watchman");
+#endif
+ else
+ packet_buf_write(&sb, "OK");
+ if (write_in_full(client_fd, sb.buf, sb.len) != sb.len)
+ warning(_("client write failed"));
+}
+
 static void loop(int fd, int idle_in_seconds)
 {
  assert(idle_in_seconds < INT_MAX / 1000);
@@ -252,11 +341,15 @@ static void loop(int fd, int idle_in_seconds)
  buf[bytes_read] = 0;
  if (!strcmp(buf, "refresh")) {
  refresh();
- } else if (!strcmp(buf, "poke")) {
- /*
- * Just a poke to keep us
- * alive, nothing to do.
- */
+ } else if (starts_with(buf, "poke")) {
+ if (buf[4] == ' ') {
+ reply_to_poke(client_fd, buf + 5);
+ } else {
+ /*
+ * Just a poke to keep us
+ * alive, nothing to do.
+ */
+ }
  } else {
  warning("BUG: Bogus command %s", buf);
  }
diff --git a/read-cache.c b/read-cache.c
index 1719f5a..8ec4be3 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1235,7 +1235,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
  if (!new) {
  const char *fmt;
 
- if (really && cache_errno == EINVAL) {
+ if (really || cache_errno == EINVAL) {
  /* If we are doing --really-refresh that
  * means the index is not valid anymore.
  */
@@ -1375,11 +1375,75 @@ static int verify_hdr(const struct cache_header *hdr, unsigned long size)
  return 0;
 }
 
+static struct untracked_cache_dir *find_untracked_cache_dir(
+ struct untracked_cache *uc, struct untracked_cache_dir *ucd,
+ const char *name)
+{
+ int component_len;
+ const char *end;
+ struct untracked_cache_dir *dir;
+
+ if (!*name)
+ return ucd;
+
+ end = strchr(name, '/');
+ if (end)
+ component_len = end - name;
+ else
+ component_len = strlen(name);
+
+ dir = lookup_untracked(uc, ucd, name, component_len);
+ if (dir)
+ return find_untracked_cache_dir(uc, dir, name + component_len + 1);
+
+ return NULL;
+}
+
 static void mark_no_watchman(size_t pos, void *data)
 {
  struct index_state *istate = data;
+ struct cache_entry *ce = istate->cache[pos];
+ struct strbuf sb = STRBUF_INIT;
+ char *c;
+ struct untracked_cache_dir *dir;
+
  assert(pos < istate->cache_nr);
- istate->cache[pos]->ce_flags |= CE_WATCHMAN_DIRTY;
+ ce->ce_flags |= CE_WATCHMAN_DIRTY;
+
+ if (!istate->untracked || !istate->untracked->root)
+ return;
+
+ strbuf_add(&sb, ce->name, ce_namelen(ce));
+
+ for (c = sb.buf + sb.len - 1; c > sb.buf; c--) {
+ if (*c == '/') {
+ strbuf_setlen(&sb, c - sb.buf);
+ break;
+ }
+ }
+
+ if (c == sb.buf)
+ strbuf_setlen(&sb, 0);
+
+ dir = find_untracked_cache_dir(istate->untracked,
+       istate->untracked->root, sb.buf);
+ if (dir)
+ dir->valid = 0;
+
+ strbuf_release(&sb);
+}
+
+static int mark_untracked_invalid(struct string_list_item *item, void *uc)
+{
+ struct untracked_cache *untracked = uc;
+ struct untracked_cache_dir *dir;
+
+ dir = find_untracked_cache_dir(untracked, untracked->root,
+       item->string);
+ if (dir)
+ dir->valid = 0;
+
+ return 0;
 }
 
 static int read_watchman_ext(struct index_state *istate, const void *data,
@@ -1409,10 +1473,24 @@ static int read_watchman_ext(struct index_state *istate, const void *data,
  ewah_each_bit(bitmap, mark_no_watchman, istate);
  ewah_free(bitmap);
 
- /*
- * TODO: update the untracked cache from the untracked data in this
- * extension.
- */
+ if (istate->untracked && istate->untracked->root) {
+ int i;
+ const char *untracked;
+
+ untracked = (const char *)data + len + 8 + bitmap_size;
+ for (i = 0; i < untracked_nr; ++i) {
+ int len = strlen(untracked);
+ string_list_append(&istate->untracked->invalid_untracked,
+   untracked);
+ untracked += len + 1;
+ }
+
+ for_each_string_list(&istate->untracked->invalid_untracked,
+ mark_untracked_invalid, istate->untracked);
+
+ if (untracked_nr)
+ istate->cache_changed |= WATCHMAN_CHANGED;
+ }
  return 0;
 }
 
@@ -1645,29 +1723,88 @@ static void post_read_index_from(struct index_state *istate)
  tweak_untracked_cache(istate);
 }
 
+/* in ms */
+#define WATCHMAN_TIMEOUT 1000
+
+static int poke_and_wait_for_reply(int fd)
+{
+ int ret = -1;
+ struct pollfd pollfd;
+ int bytes_read;
+ char reply_buf[4096];
+ const char *requested_capabilities = "";
+
+#ifdef USE_WATCHMAN
+ requested_capabilities = "watchman";
+#endif
+
+ if (fd < 0)
+ return -1;
+
+ if (packet_write_gently(fd, "poke %d %s", getpid(), requested_capabilities))
+ return -1;
+ if (packet_flush_gently(fd))
+ return -1;
+
+ /* Now wait for a reply */
+ pollfd.fd = fd;
+ pollfd.events = POLLIN;
+ if (poll(&pollfd, 1, WATCHMAN_TIMEOUT) <= 0)
+ /* No reply or error, giving up */
+ goto done_poke;
+
+ bytes_read = packet_read(fd, NULL, NULL, reply_buf, sizeof(reply_buf),
+ PACKET_READ_GENTLE_ON_EOF |
+ PACKET_READ_CHOMP_NEWLINE);
+
+ if (bytes_read < 0)
+ goto done_poke;
+
+ if (!strcmp(reply_buf, "NAK watchman")
+#ifdef USE_WATCHMAN
+    || !ends_with(reply_buf, "watchman")
+#endif
+ ) {
+ warning("We requested watchman support from index-helper, but "
+ "it doesn't support it. Please use a version of git "
+ "index-helper with watchman support.");
+ goto done_poke;
+ }
+
+ if (!starts_with(reply_buf, "OK"))
+ goto done_poke;
+
+ ret = 0;
+done_poke:
+ close(fd);
+ return ret;
+}
+
 static int poke_daemon(struct index_state *istate,
        const struct stat *st, int refresh_cache)
 {
  int fd;
- int ret = 0;
- const char *socket_path;
+ int ret = -1;
 
  /* if this is from index-helper, do not poke itself (recursively) */
  if (istate->to_shm)
  return 0;
 
- socket_path = git_path("index-helper.sock");
- if (!socket_path)
+ fd = unix_stream_connect(git_path("index-helper.sock"));
+ if (fd < 0) {
+ warning("Failed to connect to index-helper socket");
+ unlink(git_path("index-helper.sock"));
  return -1;
-
- fd = unix_stream_connect(socket_path);
+ }
  sigchain_push(SIGPIPE, SIG_IGN);
+
  if (refresh_cache) {
  packet_write_gently(fd, "refresh");
+ packet_flush_gently(fd);
+ ret = 0;
  } else {
- packet_write_gently(fd, "poke");
+ ret = poke_and_wait_for_reply(fd);
  }
- packet_flush_gently(fd);
 
  close(fd);
  sigchain_pop(SIGPIPE);
@@ -1737,6 +1874,74 @@ fail:
  return -1;
 }
 
+static void refresh_by_watchman(struct index_state *istate)
+{
+ void *shm = NULL;
+ int length;
+ int i;
+ struct stat st;
+ int fd = -1;
+ const char *path = git_path("shm-watchman-%s-%"PRIuMAX,
+    sha1_to_hex(istate->sha1),
+    (uintmax_t)getpid());
+
+ fd = open(path, O_RDONLY);
+ if (fd < 0)
+ return;
+
+ /*
+ * This watchman data is just for us -- no need to keep it
+ * around once we've got it open.
+ */
+ unlink(path);
+
+ if (fstat(fd, &st) < 0)
+ goto done;
+
+ length = st.st_size;
+ shm = mmap(NULL, length, PROT_READ, MAP_SHARED, fd, 0);
+
+ if (shm == MAP_FAILED)
+ goto done;
+
+ close(fd);
+ fd = -1;
+
+ if (length <= 20 ||
+    hashcmp(istate->sha1, (unsigned char *)shm + length - 20) ||
+    /*
+     * No need to clear CE_WATCHMAN_DIRTY set by 'WAMA' on
+     * disk. Watchman can only set more, not clear any, so
+     * this is OR mask.
+     */
+    read_watchman_ext(istate, shm, length - 20))
+ goto done;
+
+ /*
+ * Now that we've marked the invalid entries in the
+ * untracked-cache itself, we can erase them from the list of
+ * entries to be processed and mark the untracked cache for
+ * watchman usage.
+ */
+ if (istate->untracked) {
+ string_list_clear(&istate->untracked->invalid_untracked, 0);
+ istate->untracked->use_watchman = 1;
+ }
+
+ for (i = 0; i < istate->cache_nr; i++) {
+ struct cache_entry *ce = istate->cache[i];
+ if (ce_stage(ce) || (ce->ce_flags & CE_WATCHMAN_DIRTY))
+ continue;
+ ce_mark_uptodate(ce);
+ }
+done:
+ if (shm)
+ munmap(shm, length);
+
+ if (fd >= 0)
+ close(fd);
+}
+
 /* remember to discard_cache() before reading a different cache! */
 int do_read_index(struct index_state *istate, const char *path, int must_exist)
 {
@@ -1856,7 +2061,7 @@ int read_index_from(struct index_state *istate, const char *path)
  split_index = istate->split_index;
  if (!split_index || is_null_sha1(split_index->base_sha1)) {
  post_read_index_from(istate);
- return ret;
+ goto done;
  }
 
  if (split_index->base)
@@ -1877,6 +2082,10 @@ int read_index_from(struct index_state *istate, const char *path)
     sha1_to_hex(split_index->base->sha1));
  merge_base_index(istate);
  post_read_index_from(istate);
+
+done:
+ if (ret > 0 && istate->from_shm && istate->last_update)
+ refresh_by_watchman(istate);
  return ret;
 }
 
@@ -2178,7 +2387,7 @@ out:
  return 0;
 }
 
-static int verify_index(const struct index_state *istate)
+int verify_index(const struct index_state *istate)
 {
  return verify_index_from(istate, get_index_file());
 }
--
2.4.2.767.g62658d5-twtrsrc

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH v10 00/20] index-helper/watchman

Dennis Kaarsemaker
In reply to this post by David Turner
On do, 2016-05-12 at 16:19 -0400, David Turner wrote:

> This version fixes that.  I didn't test on a virtual machine, but I
> did test by adding a sleep().

I can confirm that on my single-cpu test VM, this no longer triggers
errors.

D.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH v10 20/20] untracked-cache: config option

Duy Nguyen
In reply to this post by David Turner
On Fri, May 13, 2016 at 3:20 AM, David Turner <[hidden email]> wrote:
> Add a config option to populate the untracked cache.
>
> For installations that have centrally-managed configuration, it's
> easier to set a config once than to run update-index on every
> repository.

This sounds like the job for core.untrackedCache. It populates after
reading the index though (in post_read_index_from) not writing, but I
think it accomplishes the same thing.

> Signed-off-by: David Turner <[hidden email]>
> ---
>  Documentation/config.txt | 4 ++++
>  read-cache.c             | 7 ++++++-
>  2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 385ea66..c7b76ef 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1848,6 +1848,10 @@ imap::
>         The configuration variables in the 'imap' section are described
>         in linkgit:git-imap-send[1].
>
> +index.adduntrackedcache::
> +       Automatically populate the untracked cache whenever the index
> +       is written.
> +
>  index.addwatchmanextension::
>         Automatically add the watchman extension to the index whenever
>         it is written.
> diff --git a/read-cache.c b/read-cache.c
> index 22c64d0..4a1cccf 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -2472,7 +2472,7 @@ static int do_write_index(struct index_state *istate, int newfd,
>         int entries = istate->cache_nr;
>         struct stat st;
>         struct strbuf previous_name_buf = STRBUF_INIT, *previous_name;
> -       int watchman = 0;
> +       int watchman = 0, untracked = 0;
>         uint64_t start = getnanotime();
>
>         for (i = removed = extended = 0; i < entries; i++) {
> @@ -2502,6 +2502,11 @@ static int do_write_index(struct index_state *istate, int newfd,
>             !the_index.last_update)
>                 the_index.last_update = xstrdup("");
>
> +       if (!git_config_get_bool("index.adduntrackedcache", &untracked) &&
> +           untracked &&
> +           !istate->untracked)
> +               add_untracked_cache(&the_index);
> +
>         hdr_version = istate->version;
>
>         hdr.hdr_signature = htonl(CACHE_SIGNATURE);
> --
> 2.4.2.767.g62658d5-twtrsrc
>



--
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH v10 20/20] untracked-cache: config option

David Turner
On Sun, 2016-05-15 at 16:43 +0700, Duy Nguyen wrote:

> On Fri, May 13, 2016 at 3:20 AM, David Turner <
> [hidden email]> wrote:
> > Add a config option to populate the untracked cache.
> >
> > For installations that have centrally-managed configuration, it's
> > easier to set a config once than to run update-index on every
> > repository.
>
> This sounds like the job for core.untrackedCache. It populates after
> reading the index though (in post_read_index_from) not writing, but I
> think it accomplishes the same thing.

OK, I'll drop this one.

> > Signed-off-by: David Turner <[hidden email]>
> > ---
> >  Documentation/config.txt | 4 ++++
> >  read-cache.c             | 7 ++++++-
> >  2 files changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/config.txt b/Documentation/config.txt
> > index 385ea66..c7b76ef 100644
> > --- a/Documentation/config.txt
> > +++ b/Documentation/config.txt
> > @@ -1848,6 +1848,10 @@ imap::
> >         The configuration variables in the 'imap' section are
> > described
> >         in linkgit:git-imap-send[1].
> >
> > +index.adduntrackedcache::
> > +       Automatically populate the untracked cache whenever the
> > index
> > +       is written.
> > +
> >  index.addwatchmanextension::
> >         Automatically add the watchman extension to the index
> > whenever
> >         it is written.
> > diff --git a/read-cache.c b/read-cache.c
> > index 22c64d0..4a1cccf 100644
> > --- a/read-cache.c
> > +++ b/read-cache.c
> > @@ -2472,7 +2472,7 @@ static int do_write_index(struct index_state
> > *istate, int newfd,
> >         int entries = istate->cache_nr;
> >         struct stat st;
> >         struct strbuf previous_name_buf = STRBUF_INIT,
> > *previous_name;
> > -       int watchman = 0;
> > +       int watchman = 0, untracked = 0;
> >         uint64_t start = getnanotime();
> >
> >         for (i = removed = extended = 0; i < entries; i++) {
> > @@ -2502,6 +2502,11 @@ static int do_write_index(struct index_state
> > *istate, int newfd,
> >             !the_index.last_update)
> >                 the_index.last_update = xstrdup("");
> >
> > +       if (!git_config_get_bool("index.adduntrackedcache",
> > &untracked) &&
> > +           untracked &&
> > +           !istate->untracked)
> > +               add_untracked_cache(&the_index);
> > +
> >         hdr_version = istate->version;
> >
> >         hdr.hdr_signature = htonl(CACHE_SIGNATURE);
> > --
> > 2.4.2.767.g62658d5-twtrsrc
> >
>
>
>
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH v10 00/20] index-helper/watchman

David Turner
In reply to this post by David Turner
Do folks have any more comments on this version? Do I need to re-roll
to replace 11/20 as I proposed and drop 20/20?  

Thanks.

On Thu, 2016-05-12 at 16:19 -0400, David Turner wrote:

> packet_write was causing the sigpipes (by calling write_or_die, which
> intentionally overrides the caller's preferences about signal
> handling).
>
> This version fixes that.  I didn't test on a virtual machine, but I
> did
> test by adding a sleep().
>
> David Turner (9):
>   pkt-line: add gentle version of packet_write
>   index-helper: log warnings
>   unpack-trees: preserve index extensions
>   watchman: add a config option to enable the extension
>   index-helper: kill mode
>   index-helper: don't run if already running
>   index-helper: autorun mode
>   index-helper: optionally automatically run
>   untracked-cache: config option
>
> Nguyễn Thái Ngọc Duy (11):
>   read-cache.c: fix constness of verify_hdr()
>   read-cache: allow to keep mmap'd memory after reading
>   index-helper: new daemon for caching index and related stuff
>   index-helper: add --strict
>   daemonize(): set a flag before exiting the main process
>   index-helper: add --detach
>   read-cache: add watchman 'WAMA' extension
>   watchman: support watchman to reduce index refresh cost
>   index-helper: use watchman to avoid refreshing index with lstat()
>   update-index: enable/disable watchman support
>   trace: measure where the time is spent in the index-heavy
> operations
>
>  .gitignore                               |   2 +
>  Documentation/config.txt                 |  12 +
>  Documentation/git-index-helper.txt       |  81 +++++
>  Documentation/git-update-index.txt       |   6 +
>  Documentation/technical/index-format.txt |  22 ++
>  Makefile                                 |  18 ++
>  builtin/gc.c                             |   2 +-
>  builtin/update-index.c                   |  16 +
>  cache.h                                  |  25 +-
>  config.c                                 |   5 +
>  configure.ac                             |   8 +
>  contrib/completion/git-completion.bash   |   1 +
>  daemon.c                                 |   2 +-
>  diff-lib.c                               |   4 +
>  dir.c                                    |  25 +-
>  dir.h                                    |   6 +
>  environment.c                            |   3 +
>  git-compat-util.h                        |   1 +
>  index-helper.c                           | 486
> ++++++++++++++++++++++++++++
>  name-hash.c                              |   2 +
>  pkt-line.c                               |  18 ++
>  pkt-line.h                               |   2 +
>  preload-index.c                          |   2 +
>  read-cache.c                             | 536
> ++++++++++++++++++++++++++++++-
>  refs/files-backend.c                     |   2 +
>  setup.c                                  |   4 +-
>  t/t1701-watchman-extension.sh            |  37 +++
>  t/t7063-status-untracked-cache.sh        |  22 ++
>  t/t7900-index-helper.sh                  |  69 ++++
>  t/test-lib-functions.sh                  |   4 +
>  test-dump-watchman.c                     |  16 +
>  unpack-trees.c                           |   1 +
>  watchman-support.c                       | 135 ++++++++
>  watchman-support.h                       |   7 +
>  34 files changed, 1561 insertions(+), 21 deletions(-)
>  create mode 100644 Documentation/git-index-helper.txt
>  create mode 100644 index-helper.c
>  create mode 100755 t/t1701-watchman-extension.sh
>  create mode 100755 t/t7900-index-helper.sh
>  create mode 100644 test-dump-watchman.c
>  create mode 100644 watchman-support.c
>  create mode 100644 watchman-support.h
>
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH v10 00/20] index-helper/watchman

Junio C Hamano
David Turner <[hidden email]> writes:

> Do folks have any more comments on this version?

Not from me at the moment.

> Do I need to re-roll
> to replace 11/20 as I proposed and drop 20/20?  

FYI, I think I have the one taken from

    Message-Id:
    <[hidden email]>

aka http://thread.gmane.org/gmane.comp.version-control.git/294470/focus=294585

queued at 11/20.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH v10 00/20] index-helper/watchman

David Turner
On Thu, 2016-05-19 at 13:11 -0700, Junio C Hamano wrote:

> David Turner <[hidden email]> writes:
>
> > Do folks have any more comments on this version?
>
> Not from me at the moment.
>
> > Do I need to re-roll
> > to replace 11/20 as I proposed and drop 20/20?  
>
> FYI, I think I have the one taken from
>
>     Message-Id:
>     <[hidden email]>
>
> aka http://thread.gmane.org/gmane.comp.version-control.git/294470/foc
> us=294585
>
> queued at 11/20.

LGTM!

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
12