bug-global

[RFC] 1 pass architecture and sorted writing.


From: Shigio YAMAGUCHI
Subject: [RFC] 1 pass architecture and sorted writing.
Date: Thu, 13 May 2010 18:44:12 +0900

Hi all,
I would like to add two changes to GLOBAL.
The external specification does not change, except that GSYMS is no longer used.

1. Purpose
==========

o Reduce the complexity of gtags(1).
o Improve the execution speed of gtags(1).

2. Changes
==========

(1) 1 pass architecture (2 pass -> 1 pass)
        GLOBAL no longer uses GSYMS. The symbols previously stored there are
        put into GRTAGS instead. The split into GRTAGS and GSYMS is made at
        the reference stage (gtags_first() and gtags_next()), not at the
        generation stage. This change removes a great deal of complexity
        from the program.
 
(2) Sorted writing
        Change (1) makes GRTAGS larger and slows down writing.
        To avoid this, the records written to the tag files are sorted beforehand.
        This feature takes effect when the DBOP_SORTED_WRITE flag is passed
        to dbop_open(). It uses the external POSIX sort command.

The format version is incremented by 1 (to version 6).

[Current(global-5.8.2)]

        Pass 1:
                                     +-------+ 
          source files =============>| gtags |============> GTAGS
                                     |       |============> GPATH
                                     |       | tag record
                                     |       |============> temporary file
                                     +-------+
        Pass 2:
                         tag record  +-------+
         temporary file ============>| gtags |============> GRTAGS
                                     |       |
                  GTAGS ============>|       |============> GSYMS
                                     +-------+
[New]

        Pass 1:
                                     +-------+ 
          source files =============>| gtags |============> GPATH
                                     |       |==>[sort]===> GTAGS
                                     |       |==>[sort]===> GRTAGS
                                     +-------+
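
Since GSYMS is no longer written, the split between references and other symbols has to happen while reading. The following is a minimal sketch of that idea, not the code from the patch: the type and function names are hypothetical, and a sorted array searched with bsearch(3) stands in for the real GTAGS lookup. Because the records arrive sorted by name, the previous answer can be reused for consecutive records with the same name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names for the two virtual views of the single real GRTAGS. */
enum view { VIEW_GRTAGS, VIEW_GSYMS };

/* Stand-in for GTAGS: a sorted array of defined names plus a one-entry cache. */
struct defs {
	const char *const *names;	/* sorted with strcmp() order */
	size_t count;
	int lookups;			/* number of real searches performed */
	const char *prev_name;		/* last name looked up, or NULL */
	int prev_result;		/* answer for prev_name */
};

static int
cmp_name(const void *key, const void *elem)
{
	return strcmp((const char *)key, *(const char *const *)elem);
}

/*
 * Return 1 if name is defined, 0 otherwise. The input is assumed to be
 * sorted by name, so repeated names hit the one-entry cache.
 * The name pointer must stay valid until the next call.
 */
static int
is_defined(struct defs *d, const char *name)
{
	if (d->prev_name != NULL && strcmp(name, d->prev_name) == 0)
		return d->prev_result;
	d->prev_name = name;
	d->lookups++;
	d->prev_result = bsearch(name, d->names, d->count,
	    sizeof(d->names[0]), cmp_name) != NULL;
	return d->prev_result;
}

/*
 * Copy into out[] the records of refs[] that are visible through the view:
 * VIEW_GRTAGS keeps names defined in GTAGS, VIEW_GSYMS keeps the rest.
 * Returns the number of records kept.
 */
static size_t
select_view(struct defs *d, enum view v, const char *const *refs, size_t nrefs,
    const char **out)
{
	size_t i, n = 0;

	for (i = 0; i < nrefs; i++) {
		int defined = is_defined(d, refs[i]);

		if ((v == VIEW_GRTAGS && !defined) || (v == VIEW_GSYMS && defined))
			continue;	/* record belongs to the other view */
		out[n++] = refs[i];
	}
	return n;
}
```

The trade-off in this design is that every read of the reference file pays for a lookup per distinct name, while generation no longer needs a second pass or a temporary file.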


3. Performance
==============

The performance changes as follows.
These CPU times do not include the execution time of sort(1).

(1) Creating tag files

[Linux-2.6.32 source code (387MB)]

current                 176.07 real        43.79 user         8.57 sys
new                      81.13 real        35.62 user         6.82 sys

[FreeBSD kernel source code (104MB)]

current                  12.99 real         7.64 user         0.99 sys
new                       8.57 real         6.74 user         1.02 sys

(2) Reference to tag files

[Linux-2.6.32 source code (387MB)]

o global -x ^set
current                 0.04 real         0.01 user         0.02 sys
new                     0.04 real         0.03 user         0.00 sys

o global -xr ^set
current                 2.81 real         2.05 user         0.60 sys
new                     2.85 real         2.04 user         0.63 sys

o global -xs ^set
current                 1.17 real         0.80 user         0.27 sys
new                     1.16 real         0.81 user         0.28 sys

o global -fr kernel/*.c
current                 0.61 real         0.47 user         0.10 sys
new                     0.60 real         0.43 user         0.12 sys
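
To put the tag creation figures in (1) into proportion, the real (wall-clock) times can be turned into ratios. The helpers below are a small illustrative sketch with hypothetical names; the inputs are the measurements quoted above. For the Linux tree, 176.07 / 81.13 is roughly 2.2 times faster (about 54% less elapsed time), and for the FreeBSD tree, 12.99 / 8.57 is roughly 1.5 times faster (about 34% less).

```c
#include <assert.h>

/* Ratio of the old elapsed time to the new one (2.0 means twice as fast). */
static double
speedup(double current_real, double new_real)
{
	assert(new_real > 0.0);
	return current_real / new_real;
}

/* Percentage of elapsed time saved by the new implementation. */
static double
percent_saved(double current_real, double new_real)
{
	assert(current_real > 0.0);
	return 100.0 * (current_real - new_real) / current_real;
}
```

The reference timings in (2) differ by a few hundredths of a second at most, so the same calculation gives ratios close to 1.0 there.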


4. Remaining work
=================

These will be addressed later.

o htags(1) needs to be changed too.
o Handling of the command layer parser.

Any comments?

This is the patch.

Index: gtags/gtags.c
===================================================================
RCS file: /sources/global/global/gtags/gtags.c,v
retrieving revision 1.234
diff -c -r1.234 gtags.c
*** gtags/gtags.c       25 Mar 2010 12:10:49 -0000      1.234
--- gtags/gtags.c       13 May 2010 09:30:42 -0000
***************
*** 459,487 ****
                 */
                if (!test("f", makepath(dbpath, dbname(GPATH), NULL)))
                        die("Old version tag file found. Please remake it.");
-               /*
-                * The following restrictions are in incremental updating with
-                * built-in parser:
-                * If GRTAGS or GSYMS exists, both of them exist and the format
-                * should be the same.
-                */
-               if (!use_command_parser) {
-                       int format_r, format_s;
- 
-                       if (test("f", makepath(dbpath, dbname(GRTAGS), NULL)))
-                               format_r = peek_dbformat(dbpath, cwd, GRTAGS);
-                       else
-                               format_r = -1;
-                       if (test("f", makepath(dbpath, dbname(GSYMS), NULL)))
-                               format_s = peek_dbformat(dbpath, cwd, GSYMS);
-                       else
-                               format_s = -1;
-                       if (format_r != format_s)
-                               die("%s\n  Please invoke gtags again without the -i option.",
-                                       (format_r == -1) ? "GRTAGS doesn't exist though GSYMS exists."
-                                       : (format_s == -1) ? "GSYMS doesn't exist though GRTAGS exists."
-                                       : "The format of GRTAGS and GSYMS is different.");
-               }
                (void)incremental(dbpath, cwd);
                exit(0);
        }
--- 459,464 ----
***************
*** 1050,1062 ****
  updatetags_using_builtin_parser(const char *dbpath, const char *root, IDSET *deleteset, STRBUF *addlist)
  {
        struct put_func_data data;
!       int seqno;
        const char *path, *start, *end;
-       FILE *tmp;
  
        if (vflag)
!               fprintf(stderr, "[%s] Updating '%s'.\n", now(), dbname(GTAGS));
        data.gtop[GTAGS] = gtags_open(dbpath, root, GTAGS, GTAGS_MODIFY, 0);
        /*
         * Delete tags from GTAGS.
         */
--- 1027,1050 ----
  updatetags_using_builtin_parser(const char *dbpath, const char *root, IDSET *deleteset, STRBUF *addlist)
  {
        struct put_func_data data;
!       int seqno, flags;
        const char *path, *start, *end;
  
        if (vflag)
!               fprintf(stderr, "[%s] Updating '%s' and '%s'.\n", now(), dbname(GTAGS), dbname(GRTAGS));
!       /*
!        * Open tag files.
!        */
        data.gtop[GTAGS] = gtags_open(dbpath, root, GTAGS, GTAGS_MODIFY, 0);
+       if (test("f", makepath(dbpath, dbname(GRTAGS), NULL))) {
+               data.gtop[GRTAGS] = gtags_open(dbpath, root, GRTAGS, GTAGS_MODIFY, 0);
+       } else {
+               /*
+                * If you set NULL to data.gtop[GRTAGS], parse_file() doesn't write to
+                * GRTAGS. See put_syms().
+                */
+               data.gtop[GRTAGS] = NULL;
+       }
        /*
         * Delete tags from GTAGS.
         */
***************
*** 1076,1108 ****
                        }
                }
                gtags_delete(data.gtop[GTAGS], deleteset);
        }
!       data.gtop[GTAGS]->flags = 0;
        if (extractmethod)
!               data.gtop[GTAGS]->flags |= GTAGS_EXTRACTMETHOD;
        if (debug)
!               data.gtop[GTAGS]->flags |= GTAGS_DEBUG;
!       if (test("f", makepath(dbpath, dbname(GRTAGS), NULL))) {
!               data.gtop[GRTAGS] = gtags_open(dbpath, root, GRTAGS, GTAGS_MODIFY, 0);
!               data.gtop[GRTAGS]->flags = data.gtop[GTAGS]->flags;
!               /*
!                * If you set file pointer to tmpfile_fp, gtags_put() and gtags_put_using()
!                * write records to the temporary file instead of db(3) file.
!                */
!               tmp = tmpfile();
!               if (tmp == NULL)
!                       die("cannot make temporary file.");
!               data.gtop[GRTAGS]->tmpfile_fp = tmp;
!       } else {
!               /*
!                * If you set NULL to data.gtop[GRTAGS], parse_file() doesn't write to
!                * GRTAGS. See put_syms().
!                */
!               data.gtop[GRTAGS] = NULL;
!               tmp = NULL;
!       }
        /*
!        * Add tags to GTAGS and the temporary file.
         */
        start = strbuf_value(addlist);
        end = start + strbuf_getlen(addlist);
--- 1064,1083 ----
                        }
                }
                gtags_delete(data.gtop[GTAGS], deleteset);
+               if (data.gtop[GRTAGS] != NULL)
+                       gtags_delete(data.gtop[GRTAGS], deleteset);
        }
!       /*
!        * Set flags.
!        */
!       flags = 0;
        if (extractmethod)
!               flags |= GTAGS_EXTRACTMETHOD;
        if (debug)
!               flags |= GTAGS_DEBUG;
!       data.gtop[GTAGS]->flags = data.gtop[GRTAGS]->flags = flags;
        /*
!        * Add tags to GTAGS and GRTAGS.
         */
        start = strbuf_value(addlist);
        end = start + strbuf_getlen(addlist);
***************
*** 1120,1186 ****
                        gtags_flush(data.gtop[GRTAGS], data.fid);
        }
        parser_exit();
-       if (data.gtop[GRTAGS] == NULL) {
-               gtags_close(data.gtop[GTAGS]);
-               return;
-       }
- 
-       if (vflag)
-               fprintf(stderr, "[%s] Updating '%s' and '%s'.\n", now(), dbname(GRTAGS), dbname(GSYMS));
-       data.gtop[GRTAGS]->tmpfile_fp = NULL;
-       data.gtop[GSYMS] = gtags_open(dbpath, root, GSYMS, GTAGS_MODIFY, 0);
-       /*
-        * Delete tags from GRTAGS and GSYMS.
-        */
-       if (!idset_empty(deleteset)) {
-               if (vflag) {
-                       char fid[32];
-                       int total = idset_count(deleteset);
-                       unsigned int id;
- 
-                       seqno = 1;
-                       for (id = idset_first(deleteset); id != END_OF_ID; id = idset_next(deleteset)) {
-                               snprintf(fid, sizeof(fid), "%d", id);
-                               path = gpath_fid2path(fid, NULL);
-                               if (path == NULL)
-                                       die("GPATH is corrupted.");
-                               fprintf(stderr, " [%d/%d] deleting tags of %s\n", seqno++, total, path + 2);
-                       }
-               }
-               gtags_delete(data.gtop[GRTAGS], deleteset);
-               gtags_delete(data.gtop[GSYMS], deleteset);
-       }
-       /*
-        * Move tags between GRTAGS and GSYMS.
-        *
-        * GRTAGS if not defined in GTAGS       ===> GSYMS
-        * GSYMS if defined in GTAGS            ===> GRTAGS
-        */
-       if (vflag) {
-               fprintf(stderr, " moving undefined symbols from '%s' to '%s'.\n", dbname(GRTAGS), dbname(GSYMS));
-               fprintf(stderr, " moving defined symbols from '%s' to '%s'.\n", dbname(GSYMS), dbname(GRTAGS));
-       }
-       gtags_move_ref_sym(data.gtop);
-       /*
-        * Add tags to GRTAGS and GSYMS.
-        *
-        * temporary file ===>  if defined in GTAGS     ===> GRTAGS
-        *                      else                    ===> GSYMS
-        *
-        * Gtags always makes GRTAGS and GSYMS even if they are empty.
-        */
-       if (ftell(tmp) > 0) {
-               if (vflag) {
-                       seqno = 0;
-                       for (path = start; path < end; path += strlen(path) + 1)
-                               fprintf(stderr, " [%d/%d] adding tags of %s\n", ++seqno, total, path + 2);
-               }
-               gtags_add_ref_sym(data.gtop, tmp);
-       }
-       fclose(tmp);
        gtags_close(data.gtop[GTAGS]);
!       gtags_close(data.gtop[GRTAGS]);
!       gtags_close(data.gtop[GSYMS]);
  }
  /*
   * createtags_using_builtin_parser: create tags file
--- 1095,1103 ----
                        gtags_flush(data.gtop[GRTAGS], data.fid);
        }
        parser_exit();
        gtags_close(data.gtop[GTAGS]);
!       if (data.gtop[GRTAGS] != NULL)
!               gtags_close(data.gtop[GRTAGS]);
  }
  /*
   * createtags_using_builtin_parser: create tags file
***************
*** 1196,1209 ****
        struct put_func_data data;
        int openflags, seqno;
        const char *path;
-       FILE *tmp;
  
!       tim = statistics_time_start("Time of creating %s and temporary file", dbname(GTAGS));
        if (vflag)
!               fprintf(stderr, "[%s] Creating '%s' and temporary file.\n", now(), dbname(GTAGS));
!       tmp = tmpfile();
!       if (tmp == NULL)
!               die("cannot make temporary file.");
        openflags = cflag ? GTAGS_COMPACT : 0;
        data.gtop[GTAGS] = gtags_open(dbpath, root, GTAGS, GTAGS_CREATE, openflags);
        data.gtop[GTAGS]->flags = 0;
--- 1113,1122 ----
        struct put_func_data data;
        int openflags, seqno;
        const char *path;
  
!       tim = statistics_time_start("Time of creating %s and %s.", dbname(GTAGS), dbname(GRTAGS));
        if (vflag)
!               fprintf(stderr, "[%s] Creating '%s' and '%s'.\n", now(), dbname(GTAGS), dbname(GRTAGS));
        openflags = cflag ? GTAGS_COMPACT : 0;
        data.gtop[GTAGS] = gtags_open(dbpath, root, GTAGS, GTAGS_CREATE, openflags);
        data.gtop[GTAGS]->flags = 0;
***************
*** 1214,1224 ****
        data.gtop[GRTAGS] = gtags_open(dbpath, root, GRTAGS, GTAGS_CREATE, openflags);
        data.gtop[GRTAGS]->flags = data.gtop[GTAGS]->flags;
        /*
-        * If you set file pointer to tmpfile_fp, gtags_put() and gtags_put_using()
-        * write records to the temporary file instead of db(3) file.
-        */
-       data.gtop[GRTAGS]->tmpfile_fp = tmp;
-       /*
         * Add tags to GTAGS and the temporary file.
         */
        if (file_list)
--- 1127,1132 ----
***************
*** 1247,1253 ****
        total = seqno;
        parser_exit();
        find_close();
-       data.gtop[GRTAGS]->tmpfile_fp = NULL;
        statistics_time_end(tim);
        strbuf_reset(sb);
        if (getconfs("GTAGS_extra", sb)) {
--- 1155,1160 ----
***************
*** 1264,1304 ****
                        die("%s not found.", dbname(GTAGS));
                statistics_time_end(tim);
        }
- 
-       tim = statistics_time_start("Time of creating %s and %s", dbname(GRTAGS), dbname(GSYMS));
-       if (vflag)
-               fprintf(stderr, "[%s] Creating '%s' and '%s'.\n", now(), dbname(GRTAGS), dbname(GSYMS));
-       data.gtop[GSYMS] = gtags_open(dbpath, root, GSYMS, GTAGS_CREATE, openflags);
-       /*
-        * Add tags to GRTAGS and GSYMS.
-        *
-        * temporary file ===>  if defined in GTAGS     ===> GRTAGS
-        *                      else                    ===> GSYMS
-        *
-        * Gtags always makes GRTAGS and GSYMS even if they are empty.
-        */
-       if (ftell(tmp) > 0) {
-               if (vflag) {
-                       if (file_list)
-                               find_open_filelist(file_list, root);
-                       else
-                               find_open(NULL);
-                       seqno = 0;
-                       while ((path = find_read()) != NULL) {
-                               if (*path == ' ')
-                                       continue;
-                               fprintf(stderr, " [%d/%d] adding tags of %s\n", ++seqno, total, path + 2);
-                       }
-                       find_close();
-               }
-               gtags_add_ref_sym(data.gtop, tmp);
-       }
-       fclose(tmp);
        statistics_time_end(tim);
        tim = statistics_time_start("Time of flushing B-tree cache");
        gtags_close(data.gtop[GTAGS]);
        gtags_close(data.gtop[GRTAGS]);
-       gtags_close(data.gtop[GSYMS]);
        statistics_time_end(tim);
        strbuf_reset(sb);
        if (getconfs("GRTAGS_extra", sb)) {
--- 1171,1180 ----
***************
*** 1314,1320 ****
                        fprintf(stderr, "GSYMS_extra command failed: %s\n", strbuf_value(sb));
                statistics_time_end(tim);
        }
- 
        strbuf_close(sb);
  }
  /*
--- 1190,1195 ----
Index: libutil/dbop.c
===================================================================
RCS file: /sources/global/global/libutil/dbop.c,v
retrieving revision 1.44
diff -c -r1.44 dbop.c
*** libutil/dbop.c      9 Mar 2010 03:27:18 -0000       1.44
--- libutil/dbop.c      13 May 2010 09:30:42 -0000
***************
*** 39,44 ****
--- 39,45 ----
  #ifdef HAVE_UNISTD_H
  #include <unistd.h>
  #endif
+ #include <errno.h>
  
  #include "char.h"
  #include "checkalloc.h"
***************
*** 52,57 ****
--- 53,125 ----
  #define ismeta(p)     (*((char *)(p)) == ' ')
  
  /*
+  * This should be autoconfed.
+  */
+ #define POSIX_SORT "/usr/bin/sort"
+ 
+ static char *argv[] = {
+       POSIX_SORT,
+       "-k",
+       "1,2",
+       NULL
+ };
+ 
+ static void start_sort_process(DBOP *);
+ static void terminate_sort_process(DBOP *);
+ 
+ /*
+  * start_sort_process: start sort process for sorted writing
+  *
+  *    i)      dbop    DBOP descriptor
+  */
+ static void
+ start_sort_process(DBOP *dbop) {
+       int opipe[2], ipipe[2];
+ 
+       if (pipe(opipe) < 0 || pipe(ipipe) < 0)
+               die("cannot create pipe.");
+       dbop->pid = fork();
+       /*
+        * Setup pipe for two way communication
+        *
+        *      Parent(gtags)           Child(sort)
+        *      -----------------------------------------
+        *      opipe[1] =============> opipe[0] (stdin)
+        *      ipipe[0] <============  ipipe[1] (stdout)
+        */
+       if (dbop->pid == 0) {
+               /* child process */
+               close(opipe[1]);
+               close(ipipe[0]);
+               if (dup2(opipe[0], 0) < 0 || dup2(ipipe[1], 1) < 0)
+                       die("dup2 failed.");
+               close(opipe[0]);
+               close(ipipe[1]);
+               execvp(POSIX_SORT, argv);
+       } else if (dbop->pid < 0)
+               die("fork failed.");
+       /* parent process */
+       close(opipe[0]);
+       close(ipipe[1]);
+       fcntl(ipipe[0], F_SETFD, FD_CLOEXEC);
+       fcntl(opipe[1], F_SETFD, FD_CLOEXEC);
+       dbop->sortout = fdopen(opipe[1], "w");
+       dbop->sortin = fdopen(ipipe[0], "r");
+       if (dbop->sortout == NULL || dbop->sortin == NULL)
+               die("fdopen failed.");
+ }
+ /*
+  * terminate_sort_process: terminate sort process
+  *
+  *    i)      dbop    DBOP descriptor
+  */
+ static void
+ terminate_sort_process(DBOP *dbop) {
+       while (waitpid(dbop->pid, NULL, 0) < 0 && errno == EINTR)
+               ;
+ }
+ 
+ /*
   * dbop_open: open db database.
   *
   *    i)      path    database name
***************
*** 59,64 ****
--- 127,133 ----
   *    i)      perm    file permission
   *    i)      flags
   *                    DBOP_DUP: allow duplicate records.
+  *                    DBOP_SORTED_WRITE: use sorted writing. This requires POSIX sort.
   *    r)              descripter for dbop_xxx()
   */
  DBOP *
***************
*** 118,124 ****
        dbop->perm      = (mode == 1) ? perm : 0;
        dbop->lastdat   = NULL;
        dbop->lastsize  = 0;
! 
        return dbop;
  }
  /*
--- 187,199 ----
        dbop->perm      = (mode == 1) ? perm : 0;
        dbop->lastdat   = NULL;
        dbop->lastsize  = 0;
!       dbop->sortout   = NULL;
!       dbop->sortin    = NULL;
!       /*
!        * Setup sorted writing.
!        */
!       if (dbop->openflags & DBOP_SORTED_WRITE)
!               start_sort_process(dbop);
        return dbop;
  }
  /*
***************
*** 170,175 ****
--- 245,258 ----
                die("primary key size == 0.");
        if (len > MAXKEYLEN)
                die("primary key too long.");
+       /* sorted writing */
+       if (dbop->sortout != NULL) {
+               fputs(name, dbop->sortout);
+               putc('\t', dbop->sortout);
+               fputs(data, dbop->sortout);
+               putc('\n', dbop->sortout);
+               return;
+       }
        key.data = (char *)name;
        key.size = strlen(name)+1;
        dat.data = (char *)data;
***************
*** 204,209 ****
--- 287,300 ----
                die("primary key size == 0.");
        if (len > MAXKEYLEN)
                die("primary key too long.");
+       /* sorted writing */
+       if (dbop->sortout != NULL) {
+               fputs(name, dbop->sortout);
+               putc('\t', dbop->sortout);
+               fputs(data, dbop->sortout);
+               putc('\n', dbop->sortout);
+               return;
+       }
        key.data = (char *)name;
        key.size = strlen(name)+1;
        dat.data = (char *)data;
***************
*** 501,506 ****
--- 592,628 ----
  {
        DB *db = dbop->db;
  
+       /*
+        * Load sorted tag records and write them to the tag file.
+        */
+       if (dbop->sortout != NULL) {
+               STRBUF *sb = strbuf_open(256);
+               char *p;
+ 
+               /*
+                * End of the former stage of sorted writing.
+                * fclose() and sortout = NULL is important.
+                *
+                * fclose(): enables reading from sortin descriptor.
+                * sortout = NULL: makes the following dbop_put write to the tag file directly.
+                */
+               fclose(dbop->sortout);
+               dbop->sortout = NULL;
+               /*
+                * The last stage of sorted writing.
+                */
+               while (strbuf_fgets(sb, dbop->sortin, STRBUF_NOCRLF)) {
+                       for (p = strbuf_value(sb); *p && *p != '\t'; p++)
+                               ;
+                       if (!*p)
+                               die("unexpected end of record.");
+                       *p++ = '\0';
+                       dbop_put(dbop, strbuf_value(sb), p);
+               }
+               fclose(dbop->sortin);
+               strbuf_close(sb);
+               terminate_sort_process(dbop);
+       }
  #ifdef USE_DB185_COMPAT
        (void)db->close(db);
  #else
Index: libutil/dbop.h
===================================================================
RCS file: /sources/global/global/libutil/dbop.h,v
retrieving revision 1.27
diff -c -r1.27 dbop.h
*** libutil/dbop.h      9 Mar 2010 03:27:18 -0000       1.27
--- libutil/dbop.h      13 May 2010 09:30:42 -0000
***************
*** 56,61 ****
--- 56,67 ----
        int keylen;                     /* key length */
        char prev[MAXKEYLEN+1];         /* previous key value */
        int perm;                       /* file permission */
+       /*
+        * (3) sorted write
+        */
+       FILE *sortout;                  /* write to sort command */
+       FILE *sortin;                   /* read from sort command */
+       int pid;                        /* sort process id */
  } DBOP;
  
  /*
***************
*** 65,73 ****
  /*
   * ioflags
   */
! #define DBOP_KEY      1               /* read key part                */
! #define DBOP_PREFIX   2               /* prefixed read                */
! #define DBOP_RAW      4               /* raw read                     */
  
  DBOP *dbop_open(const char *, int, int, int);
  const char *dbop_get(DBOP *, const char *);
--- 71,80 ----
  /*
   * ioflags
   */
! #define DBOP_KEY              1       /* read key part                */
! #define DBOP_PREFIX           2       /* prefixed read                */
! #define DBOP_RAW              4       /* raw read                     */
! #define DBOP_SORTED_WRITE     8       /* sorted write                 */
  
  DBOP *dbop_open(const char *, int, int, int);
  const char *dbop_get(DBOP *, const char *);
Index: libutil/gtagsop.c
===================================================================
RCS file: /sources/global/global/libutil/gtagsop.c,v
retrieving revision 1.118
diff -c -r1.118 gtagsop.c
*** libutil/gtagsop.c   5 Mar 2010 23:47:44 -0000       1.118
--- libutil/gtagsop.c   13 May 2010 09:30:42 -0000
***************
*** 62,67 ****
--- 62,68 ----
  static int compare_lineno(const void *, const void *);
  static int compare_tags(const void *, const void *);
  static const char *seekto(const char *, int);
+ static int is_defined_in_GTAGS(GTOP *, const char *);
  static void flush_pool(GTOP *, const char *);
  static void segment_read(GTOP *);
  
***************
*** 122,147 ****
        return p;
  }
  /*
-  * tmpfile_put: write to the temporary file instead of db(3) file.
-  *
-  *    i)      fp      file pointer
-  *    i)      key     key of db(3)
-  *    i)      data    data of db(3)
-  *
-  * record format:
-  * <key>\t<data>\n
-  *
-  * Finally, this record should be written in db(3) file.
-  */
- void
- tmpfile_put(FILE *fp, const char *key, const char *data)
- {
-       fputs(key, fp);
-       putc('\t', fp);
-       fputs(data, fp);
-       putc('\n', fp);
- }
- /*
   * Tag format
   *
   * [Specification of format version 4]
--- 123,128 ----
***************
*** 261,270 ****
   *       $ global -x main
   *       GTAGS seems older format. Please remake tag files.
   */
! static int upper_bound_version = 5;   /* acceptable format version (upper bound) */
! static int lower_bound_version = 4;   /* acceptable format version (lower bound) */
  static const char *const tagslist[] = {"GPATH", "GTAGS", "GRTAGS", "GSYMS"};
  /*
   * dbname: return db name
   *
   *    i)      db      0: GPATH, 1: GTAGS, 2: GRTAGS, 3: GSYMS
--- 242,291 ----
   *       $ global -x main
   *       GTAGS seems older format. Please remake tag files.
   */
! static int new_format_version = 6;    /* new format version */
! static int upper_bound_version = 6;   /* acceptable format version (upper bound) */
! static int lower_bound_version = 6;   /* acceptable format version (lower bound) */
  static const char *const tagslist[] = {"GPATH", "GTAGS", "GRTAGS", "GSYMS"};
  /*
+  * Virtual GRTAGS, GSYMS processing:
+  *
+  * We use a real GRTAGS as virtual GRTAGS and GSYMS.
+  * In fact, GSYMS tag file doesn't exist.
+  *
+  * Real tag file      virtual tag file
+  * --------------------------------------
+  * GTAGS =============> GTAGS
+  *
+  * GRTAGS ====+=======> GRTAGS        tags which is defined in GTAGS
+  *            +=======> GSYMS tags which is not defined in GTAGS
+  */
+ #define VIRTUAL_GRTAGS_GSYMS_PROCESSING(gtop)                                         \
+       if (gtop->db != GTAGS) {                                                        \
+               int defined = is_defined_in_GTAGS(gtop, gtop->dbop->lastkey);           \
+               if (gtop->db == GRTAGS && !defined || gtop->db == GSYMS && defined)     \
+                       continue;                                                       \
+       }
+ /*
+  * is_defined_in_GTAGS: whether or not the name is defined in GTAGS.
+  *
+  *    i)      gtop
+  *    i)      name    tag name
+  *    r)              0: not defined, 1: defined
+  *
+  * It is assumed that the input stream is sorted by the tag name.
+  */
+ static int
+ is_defined_in_GTAGS(GTOP *gtop, const char *name)
+ {
+       static char prev_name[MAXTOKEN+1];
+       static int prev_result;
+ 
+       if (!strcmp(name, prev_name))
+               return prev_result;
+       strlimcpy(prev_name, name, sizeof(prev_name));
+       return prev_result = dbop_get(gtop->gtags, prev_name) ? 1 : 0;
+ }
+ /*
   * dbname: return db name
   *
   *    i)      db      0: GPATH, 1: GTAGS, 2: GRTAGS, 3: GSYMS
***************
*** 294,306 ****
  gtags_open(const char *dbpath, const char *root, int db, int mode, int flags)
  {
        GTOP *gtop;
        int dbmode;
  
        gtop = (GTOP *)check_calloc(sizeof(GTOP), 1);
        gtop->db = db;
        gtop->mode = mode;
        gtop->openflags = flags;
-       gtop->format_version = 4;
        /*
         * Open tag file allowing duplicate records.
         */
--- 315,327 ----
  gtags_open(const char *dbpath, const char *root, int db, int mode, int flags)
  {
        GTOP *gtop;
+       char tagfile[MAXPATHLEN+1];
        int dbmode;
  
        gtop = (GTOP *)check_calloc(sizeof(GTOP), 1);
        gtop->db = db;
        gtop->mode = mode;
        gtop->openflags = flags;
        /*
         * Open tag file allowing duplicate records.
         */
***************
*** 317,334 ****
        default:
                assert(0);
        }
!       gtop->dbop = dbop_open(makepath(dbpath, dbname(db), NULL), dbmode, 
0644, DBOP_DUP);
        if (gtop->dbop == NULL) {
                if (dbmode == 1)
                        die("cannot make %s.", dbname(db));
                die("%s not found.", dbname(db));
        }
        if (gtop->mode == GTAGS_CREATE) {
                /*
                 * Decide format.
                 */
                gtop->format = 0;
!               gtop->format_version = 5;
                /*
                 * GRTAGS and GSYSM always use compact format.
                 * GTAGS uses compact format only when the -c option specified.
--- 338,376 ----
        default:
                assert(0);
        }
!       /*
!        * GRTAGS and GSYMS are virtual tag file. They are included in a real GRTAGS file.
!        * In fact, GSYMS doesn't exist now.
!        *
!        * GRTAGS:      tags which belongs to GRTAGS, and are defined in GTAGS.
!        * GSYMS:       tags which belongs to GRTAGS, and is not defined in GTAGS.
!        */
!       strlimcpy(tagfile, makepath(dbpath, dbname(db == GSYMS ? GRTAGS : db), NULL), sizeof(tagfile));
!       gtop->dbop = dbop_open(tagfile, dbmode, 0644, DBOP_DUP|DBOP_SORTED_WRITE);
        if (gtop->dbop == NULL) {
                if (dbmode == 1)
                        die("cannot make %s.", dbname(db));
                die("%s not found.", dbname(db));
        }
+       if (gtop->mode == GTAGS_READ && db != GTAGS) {
+               const char *gtags = makepath(dbpath, dbname(GTAGS), NULL);
+               int format_version;
+ 
+               gtop->gtags = dbop_open(gtags, 0, 0, 0);
+               if (gtop->gtags == NULL)
+                       die("GTAGS not found.");
+               format_version = dbop_getversion(gtop->dbop);
+               if (format_version > upper_bound_version)
+                       die("%s seems new format. Please install the latest GLOBAL.", gtags);
+               else if (format_version < lower_bound_version)
+                       die("%s seems older format. Please remake tag files.", gtags);
+       }
        if (gtop->mode == GTAGS_CREATE) {
                /*
                 * Decide format.
                 */
                gtop->format = 0;
!               gtop->format_version = new_format_version;
                /*
                 * GRTAGS and GSYSM always use compact format.
                 * GTAGS uses compact format only when the -c option specified.
***************
*** 365,373 ****
                 */
                gtop->format_version = dbop_getversion(gtop->dbop);
                if (gtop->format_version > upper_bound_version)
!                       die("%s seems new format. Please install the latest GLOBAL.", dbname(gtop->db));
                else if (gtop->format_version < lower_bound_version)
!                       die("%s seems older format. Please remake tag files.", dbname(gtop->db));
                gtop->format = 0;
                if (dbop_getoption(gtop->dbop, COMPACTKEY) != NULL)
                        gtop->format |= GTAGS_COMPACT;
--- 407,415 ----
                 */
                gtop->format_version = dbop_getversion(gtop->dbop);
                if (gtop->format_version > upper_bound_version)
!                       die("%s seems new format. Please install the latest GLOBAL.", tagfile);
                else if (gtop->format_version < lower_bound_version)
!                       die("%s seems older format. Please remake tag files.", tagfile);
                gtop->format = 0;
                if (dbop_getoption(gtop->dbop, COMPACTKEY) != NULL)
                        gtop->format |= GTAGS_COMPACT;
***************
*** 459,469 ****
        strbuf_putn(gtop->sb, lno);
        strbuf_putc(gtop->sb, ' ');
        strbuf_puts(gtop->sb, (gtop->format & GTAGS_COMPRESS) ? compress(img, key) : img);
!       if (gtop->tmpfile_fp != NULL) {
!               tmpfile_put(gtop->tmpfile_fp, key, strbuf_value(gtop->sb));
!       } else {
!               dbop_put(gtop->dbop, key, strbuf_value(gtop->sb));
!       }
  }
  /*
   * gtags_flush: Flush the pool for compact format.
--- 501,507 ----
        strbuf_putn(gtop->sb, lno);
        strbuf_putc(gtop->sb, ' ');
        strbuf_puts(gtop->sb, (gtop->format & GTAGS_COMPRESS) ? compress(img, key) : img);
!       dbop_put(gtop->dbop, key, strbuf_value(gtop->sb));
  }
  /*
   * gtags_flush: Flush the pool for compact format.
***************
*** 480,582 ****
        }
  }
  /*
-  * gtags_add_ref_sym: read candidate of reference and other symbol from temporary file,
-  *                    and put into GRTAGS or GSYMS.
-  *
-  *    i) gtop         array of descripter of GTOP
-  *    i) ip           file pointer of temporary file
-  */
- void
- gtags_add_ref_sym(GTOP *const *gtop, FILE *ip)
- {
-       STRBUF *ib = strbuf_open(MAXBUFLEN);
- 
-       rewind(ip);
-       while (strbuf_fgets(ib, ip, STRBUF_NOCRLF) != NULL) {
-               /*
-                * [record format]
-                * <key>\t<data>\n
-                */
-               char *key = strbuf_value(ib);
-               char *data = strchr(key, '\t');
-               if (data == NULL)
-                       die("gtags_add_ref_sym: internal error.");
-               *data++ = '\0';
-               /*
-                * If the key is defined in GTAGS then put the record into GRTAGS
-                * else put it into GSYMS.
-                */
-               if (dbop_get(gtop[GTAGS]->dbop, key) != NULL)
-                       dbop_put(gtop[GRTAGS]->dbop, key, data);
-               else
-                       dbop_put(gtop[GSYMS]->dbop, key, data);
-       }
-       strbuf_close(ib);
- }
- /*
-  * gtags_move_ref_sym: move defined symbols from GSYMS to GRTAGS,
-  *                     and move undefined symbols from GRTAGS to GSYMS.
-  *
-  *    i) gtop         array of descripter of GTOP
-  */
- void
- gtags_move_ref_sym(GTOP *const *gtop)
- {
-       int result = 0;
-       DBOP *defdbop;
-       const char *defkey;
-       struct {
-               DBOP *dbop;
-               const char *key, *dat;
-               char saved_key[MAXKEYLEN+1];
-               STRBUF *saved_dat;
-       } ref, sym, *smaller, *larger;
- 
-       defdbop = gtop[GTAGS]->dbop;
-       defkey = dbop_first(defdbop, NULL, NULL, DBOP_KEY);
-       ref.dbop = gtop[GRTAGS]->dbop;
-       ref.dat = dbop_first(ref.dbop, NULL, NULL, 0);
-       ref.key = ref.dbop->lastkey;
-       ref.saved_dat = strbuf_open(MAXBUFLEN);
-       sym.dbop = gtop[GSYMS]->dbop;
-       sym.dat = dbop_first(sym.dbop, NULL, NULL, 0);
-       sym.key = sym.dbop->lastkey;
-       sym.saved_dat = strbuf_open(MAXBUFLEN);
-       for (;;) {
-               if (ref.dat != NULL && (sym.dat == NULL || strcmp(ref.key, sym.key) < 0)) {
-                       smaller = &ref;
-                       larger = &sym;
-               } else if (sym.dat != NULL) {
-                       smaller = &sym;
-                       larger = &ref;
-               } else {
-                       break;
-               }
-               while (defkey != NULL && (result = strcmp(smaller->key, defkey)) > 0)
-                       defkey = dbop_next(defdbop);
-               if (defkey != NULL && (smaller == &ref ? result < 0 : result == 0)) {
-                       /*
-                        * Calling dbop_put doesn't affect the position of cursor,
-                        * but it invalidates the pointer to read data.
-                        */
-                       if (larger->dat != NULL && larger->key != larger->saved_key) {
-                               strlimcpy(larger->saved_key, larger->key, sizeof(larger->saved_key));
-                               larger->key = larger->saved_key;
-                               strbuf_reset(larger->saved_dat);
-                               strbuf_puts(larger->saved_dat, larger->dat);
-                               larger->dat = strbuf_value(larger->saved_dat);
-                       }
-                       dbop_put(larger->dbop, smaller->key, smaller->dat);
-                       dbop_delete(smaller->dbop, NULL);
-               }
-               smaller->dat = dbop_next(smaller->dbop);
-               smaller->key = smaller->dbop->lastkey;
-       }
- 
-       strbuf_close(ref.saved_dat);
-       strbuf_close(sym.saved_dat);
- }
- /*
   * gtags_put: put tag record with packing.
   *
   *    i)      gtop    descripter of GTOP
--- 518,523 ----
***************
*** 643,653 ****
                strbuf_puts(gtop->sb, gtop->format & GTAGS_COMPRESS ?
                        compress(ptable.part[PART_LINE].start, key) :
                        ptable.part[PART_LINE].start);
!               if (gtop->tmpfile_fp != NULL) {
!                       tmpfile_put(gtop->tmpfile_fp, key, strbuf_value(gtop->sb));
!               } else {
!                       dbop_put(gtop->dbop, key, strbuf_value(gtop->sb));
!               }
        }
        recover(&ptable);
  }
--- 584,590 ----
                strbuf_puts(gtop->sb, gtop->format & GTAGS_COMPRESS ?
                        compress(ptable.part[PART_LINE].start, key) :
                        ptable.part[PART_LINE].start);
!               dbop_put(gtop->dbop, key, strbuf_value(gtop->sb));
        }
        recover(&ptable);
  }
***************
*** 781,786 ****
--- 718,724 ----
                     tagline != NULL;
                     tagline = dbop_next(gtop->dbop))
                {
+                       VIRTUAL_GRTAGS_GSYMS_PROCESSING(gtop);
                        /* extract file id */
                        p = locatestring(tagline, " ", MATCH_FIRST);
                        if (p == NULL)
***************
*** 820,827 ****
                gtop->gtp.path = gtop->path_array[gtop->path_index++];
                return &gtop->gtp;
        } else if (gtop->flags & GTOP_KEY) {
!               return ((gtop->gtp.tag = dbop_first(gtop->dbop, key, preg, dbflags)) == NULL)
!                       ? NULL : &gtop->gtp;
        } else {
                if (gtop->vb == NULL)
                        gtop->vb = varray_open(sizeof(GTP), 200);
--- 758,771 ----
                gtop->gtp.path = gtop->path_array[gtop->path_index++];
                return &gtop->gtp;
        } else if (gtop->flags & GTOP_KEY) {
!               for (gtop->gtp.tag = dbop_first(gtop->dbop, key, preg, dbflags);
!                    gtop->gtp.tag != NULL;
!                    gtop->gtp.tag = dbop_next(gtop->dbop))
!               {
!                       VIRTUAL_GRTAGS_GSYMS_PROCESSING(gtop);
!                       break;
!               }
!               return gtop->gtp.tag ? &gtop->gtp : NULL;
        } else {
                if (gtop->vb == NULL)
                        gtop->vb = varray_open(sizeof(GTP), 200);
***************
*** 859,872 ****
  GTP *
  gtags_next(GTOP *gtop)
  {
        if (gtop->flags & GTOP_PATH) {
                if (gtop->path_index >= gtop->path_count)
                        return NULL;
                gtop->gtp.path = gtop->path_array[gtop->path_index++];
                return &gtop->gtp;
        } else if (gtop->flags & GTOP_KEY) {
!               return ((gtop->gtp.tag = dbop_next(gtop->dbop)) == NULL)
!                       ? NULL : &gtop->gtp;
        } else {
                /*
                 * End of segment.
--- 803,824 ----
  GTP *
  gtags_next(GTOP *gtop)
  {
+       const char *tagline;
+ 
        if (gtop->flags & GTOP_PATH) {
                if (gtop->path_index >= gtop->path_count)
                        return NULL;
                gtop->gtp.path = gtop->path_array[gtop->path_index++];
                return &gtop->gtp;
        } else if (gtop->flags & GTOP_KEY) {
!               for (gtop->gtp.tag = dbop_next(gtop->dbop);
!                    gtop->gtp.tag != NULL;
!                    gtop->gtp.tag = dbop_next(gtop->dbop))
!               {
!                       VIRTUAL_GRTAGS_GSYMS_PROCESSING(gtop);
!                       break;
!               }
!               return gtop->gtp.tag ? &gtop->gtp : NULL;
        } else {
                /*
                 * End of segment.
***************
*** 907,912 ****
--- 859,866 ----
                strhash_close(gtop->path_hash);
        gpath_close();
        dbop_close(gtop->dbop);
+       if (gtop->gtags)
+               dbop_close(gtop->gtags);
        free(gtop);
  }
  /*
***************
*** 1001,1011 ****
                                                strbuf_putn(gtop->sb, n);
                                        }
                                        if (strbuf_getlen(gtop->sb) > DBOP_PAGESIZE / 4) {
!                                               if (gtop->tmpfile_fp != NULL) {
!                                                       tmpfile_put(gtop->tmpfile_fp, key, strbuf_value(gtop->sb));
!                                               } else {
!                                                       dbop_put(gtop->dbop, key, strbuf_value(gtop->sb));
!                                               }
                                                strbuf_setlen(gtop->sb, header_offset);
                                        }
                                }
--- 955,961 ----
                                                strbuf_putn(gtop->sb, n);
                                        }
                                        if (strbuf_getlen(gtop->sb) > DBOP_PAGESIZE / 4) {
!                                               dbop_put(gtop->dbop, key, strbuf_value(gtop->sb));
                                                strbuf_setlen(gtop->sb, header_offset);
                                        }
                                }
***************
*** 1029,1050 ****
                                        strbuf_putc(gtop->sb, ',');
                                strbuf_putn(gtop->sb, n);
                                if (strbuf_getlen(gtop->sb) > DBOP_PAGESIZE / 4) {
!                                       if (gtop->tmpfile_fp != NULL) {
!                                               tmpfile_put(gtop->tmpfile_fp, key, strbuf_value(gtop->sb));
!                                       } else {
!                                               dbop_put(gtop->dbop, key, strbuf_value(gtop->sb));
!                                       }
                                        strbuf_setlen(gtop->sb, header_offset);
                                }
                                last = n;
                        }
                }
                if (strbuf_getlen(gtop->sb) > header_offset) {
!                       if (gtop->tmpfile_fp != NULL) {
!                               tmpfile_put(gtop->tmpfile_fp, key, strbuf_value(gtop->sb));
!                       } else {
!                               dbop_put(gtop->dbop, key, strbuf_value(gtop->sb));
!                       }
                }
                /* Free line number table */
                varray_close(vb);
--- 979,992 ----
                                        strbuf_putc(gtop->sb, ',');
                                strbuf_putn(gtop->sb, n);
                                if (strbuf_getlen(gtop->sb) > DBOP_PAGESIZE / 4) {
!                                       dbop_put(gtop->dbop, key, strbuf_value(gtop->sb));
                                        strbuf_setlen(gtop->sb, header_offset);
                                }
                                last = n;
                        }
                }
                if (strbuf_getlen(gtop->sb) > header_offset) {
!                       dbop_put(gtop->dbop, key, strbuf_value(gtop->sb));
                }
                /* Free line number table */
                varray_close(vb);
***************
*** 1081,1086 ****
--- 1023,1029 ----
         */
        gtop->cur_tagname[0] = '\0';
        while ((tagline = dbop_next(gtop->dbop)) != NULL) {
+               VIRTUAL_GRTAGS_GSYMS_PROCESSING(gtop);
                /*
                 * get tag name and line number.
                 *
Index: libutil/gtagsop.h
===================================================================
RCS file: /sources/global/global/libutil/gtagsop.h,v
retrieving revision 1.45
diff -c -r1.45 gtagsop.h
*** libutil/gtagsop.h   2 Mar 2010 02:02:53 -0000       1.45
--- libutil/gtagsop.h   13 May 2010 09:30:42 -0000
***************
*** 73,78 ****
--- 73,79 ----
  
  typedef struct {
        DBOP *dbop;                     /* descripter of DBOP */
+       DBOP *gtags;                    /* descripter of GTAGS */
        int format_version;             /* format version */
        int format;                     /* GTAGS_COMPACT, GTAGS_COMPRESS */
        int mode;                       /* mode */
***************
*** 103,122 ****
        STRBUF *sb;                     /* string buffer */
        /* used for compact format and path name only read */
        STRHASH *path_hash;
-       /*
-        * Stuff for 1-pass parsing
-        *
-        * If you set file pointer to tmpfile_fp, gtags_put() and gtags_put_using()
-        * write records to the temporary file instead of db(3) file.
-        */
-       FILE *tmpfile_fp;
  } GTOP;
  
  const char *dbname(int);
  GTOP *gtags_open(const char *, const char *, int, int, int);
  void gtags_put(GTOP *, const char *, const char *);
  void gtags_put_using(GTOP *, const char *, int, const char *, const char *);
! void gtags_flush(GTOP *, const char *);
  void gtags_add_ref_sym(GTOP *const *, FILE *);
  void gtags_move_ref_sym(GTOP *const *);
  void gtags_delete(GTOP *, IDSET *);
--- 104,116 ----
        STRBUF *sb;                     /* string buffer */
        /* used for compact format and path name only read */
        STRHASH *path_hash;
  } GTOP;
  
  const char *dbname(int);
  GTOP *gtags_open(const char *, const char *, int, int, int);
  void gtags_put(GTOP *, const char *, const char *);
  void gtags_put_using(GTOP *, const char *, int, const char *, const char *);
! /* void gtags_flush(GTOP *, const char *);*/
  void gtags_add_ref_sym(GTOP *const *, FILE *);
  void gtags_move_ref_sym(GTOP *const *);
  void gtags_delete(GTOP *, IDSET *);
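
Note on the referring-stage filter: the hunks above call
VIRTUAL_GRTAGS_GSYMS_PROCESSING(gtop) inside the read loops, but the macro
itself is not defined in this part of the patch. The following is only a
minimal sketch of the idea, not the macro's actual body. It assumes that a
record read from the merged GRTAGS is shown as a reference when its key is
defined in GTAGS and as an "other symbol" when it is not. Every name in it
(virt_mode, defs, is_defined, should_skip, count_visible) is hypothetical,
and a sorted array stands in for the GTAGS database.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Which virtual tag file the reader presents (hypothetical names). */
enum virt_mode { VIRT_GRTAGS, VIRT_GSYMS };

/* Stand-in for GTAGS: a sorted array of defined symbol names. */
struct defs {
	const char **names;
	size_t count;
};

static int cmp_name(const void *key, const void *elem)
{
	return strcmp((const char *)key, *(const char *const *)elem);
}

/* Return 1 if name is defined, i.e. would be found in GTAGS. */
int is_defined(const struct defs *d, const char *name)
{
	return bsearch(name, d->names, d->count,
		       sizeof(d->names[0]), cmp_name) != NULL;
}

/*
 * Return 1 if a record with this key must be skipped by the reader.
 * Virtual GRTAGS shows only keys that have a definition;
 * virtual GSYMS shows only keys that have none.
 */
int should_skip(const struct defs *d, enum virt_mode mode, const char *name)
{
	int defined = is_defined(d, name);

	assert(mode == VIRT_GRTAGS || mode == VIRT_GSYMS);
	return mode == VIRT_GRTAGS ? !defined : defined;
}

/* Count the records visible through one virtual file. */
size_t count_visible(const struct defs *d, enum virt_mode mode,
		     const char **keys, size_t nkeys)
{
	size_t i, n = 0;

	for (i = 0; i < nkeys; i++) {
		if (should_skip(d, mode, keys[i]))
			continue;	/* same shape as the read loops above */
		n++;
	}
	return n;
}
```

The "continue" in count_visible() plays the role that the macro appears to
play in the for/while loops of the patch: the physical file is scanned once,
and each reader sees only the records that belong to its virtual file.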
--
Shigio YAMAGUCHI <address@hidden>
PGP fingerprint: D1CB 0B89 B346 4AB6 5663  C4B6 3CA5 BBB3 57BE DDA3


