[Gnu-arch-users] "librifying" libarch


From: Tom Lord
Subject: [Gnu-arch-users] "librifying" libarch
Date: Mon, 20 Oct 2003 09:55:21 -0700 (PDT)



Step 1) Don't do it.

Step 2) Oh you're going to do it anyway?

Here is a series of correctness-preserving transforms.  After each
step in this plan, tla should still be completely functional.  At the
end, you'll have an all-singing, all-dancing, thread-safe,
bindings-friendly, interruptible libarch api AND, as a bonus, you
won't have horked libarch beyond all reasonable approximations of
sanity.

a) Split libarch into two directories:  libarch-cmds
   and libarch-utils.

   libarch-utils should contain just short-running functions
   which help with parsing or finding things in the file system.

   For example, the namespace code, the code that looks for a
   tree-root, and the code that parses a patch log should all go
   in -utils.

   And on the other hand, the cmd-*.c files, archives.[ch], 
   make-changeset.[ch], and apply-changeset.[ch] go in -cmds.

   As a sort of sanity check, after this change there should be a
   strict layering in which libarch-utils does not call any function
   in libarch-cmds.


b) Eliminate exits from the -utils.

   Leave the invariants alone, but otherwise, if any of the -utils can
   exit (by panic, exit, or a safe_* call), change them to return an
   error value and update callers.

   In the -cmds, if a caller now has to check for a new kind of error
   from a -utils routine, the -cmd caller should print an error and
   exit at that point.

   (It may be worth checking in libawk and libfsutils for other places
   where new error returns are needed.)

   (It is not obviously desirable to check for malloc-like functions
   returning 0.  That _can_ be done here if you really want to, but it
   can also be postponed indefinitely, relying "for now" on the
   underlying allocator to simply terminate the process -- the most
   likely correct response 99.9% of the time.)
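
   As a rough illustration of step (b), here is a minimal sketch.  The
   function names and the patch-level example are hypothetical, not
   taken from libarch.  The -utils routine reports failure through its
   return value, and the decision to print and exit moves up into the
   -cmds caller.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical -utils routine.  Returns 0 on success and -1 on a
   malformed name.  Before step (b) it might have panicked instead.  */
int
util_parse_patch_level (const char * name, long * level_out)
{
  char * end;
  long n;

  if (strncmp (name, "patch-", 6) != 0)
    return -1;
  if (!isdigit ((unsigned char) name[6]))
    return -1;

  n = strtol (name + 6, &end, 10);
  if (*end != '\0')
    return -1;

  *level_out = n;
  return 0;
}

/* Hypothetical -cmds caller.  It checks the new error return and is
   the one that prints a message and exits.  */
long
cmd_patch_level_or_die (const char * name)
{
  long level;

  if (util_parse_patch_level (name, &level) < 0)
    {
      fprintf (stderr, "illegal patch level name: %s\n", name);
      exit (2);
    }
  return level;
}
```

   With this split, the -utils layer stays usable from code that must
   not exit, while tla's visible behavior on bad input is unchanged.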


c) Switch to pool-based allocation in libawk and libfsutils.

   Starting in libawk and libfsutils, switch to pool-based allocation.

   Currently, allocation uses calls like:

        lim_malloc (0, 42);     /* allocate 42 bytes */

        str_save (0, "hello world\n");  /* allocate a string copy */

    That 0 parameter is an optional alloc_limits value (see
    libhackerlab).

    Change all of those 0s to a parameter which is a new argument to 
    the calling function.   Update all callers to:

        1) pass 0 if they do not themselves have an alloc_limits 
           parameter

        2) pass the alloc limits parameter otherwise
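
    The two caller cases can be sketched as follows.  The alloc_limits
    typedef and the lim_malloc body below are stand-ins so that the
    example compiles on its own (the real ones come from libhackerlab),
    and path_in_dir, log_dir_of_tree, and the "patch-log" name are
    hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-ins so that the sketch compiles on its own.  The real
   alloc_limits type and lim_malloc live in libhackerlab; this stub
   simply ignores the limits argument.  */
typedef struct alloc_limits_stub * alloc_limits;

static void *
lim_malloc (alloc_limits limits, size_t amt)
{
  (void) limits;
  return malloc (amt);
}

/* Hypothetical helper after step (c): the literal 0 it used to pass
   to lim_malloc has become its own first parameter.  */
char *
path_in_dir (alloc_limits limits, const char * dir, const char * file)
{
  size_t dir_len = strlen (dir);
  size_t file_len = strlen (file);
  char * answer = lim_malloc (limits, dir_len + 1 + file_len + 1);

  if (!answer)
    return 0;
  memcpy (answer, dir, dir_len);
  answer[dir_len] = '/';
  memcpy (answer + dir_len + 1, file, file_len + 1);
  return answer;
}

/* Case 1: a caller with no alloc_limits parameter of its own passes 0.  */
char *
log_dir_of_tree (const char * tree_root)
{
  return path_in_dir (0, tree_root, "patch-log");
}

/* Case 2: a caller that has an alloc_limits parameter passes it along.  */
char *
log_dir_of_tree_in (alloc_limits limits, const char * tree_root)
{
  return path_in_dir (limits, tree_root, "patch-log");
}
```

    Repeating the transform pushes the literal 0s outward one layer at
    a time, which is why steps (d) and (e) can reuse the same recipe.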


d) Repeat step (c) for the -utils directory


e) Repeat step (c) for the -cmds directory


f) Extend the alloc_limits mechanism in hackerlab

   Add to it a facility for attaching arbitrary data to an
   alloc_limits pool (for example, see how this is used in (h)).
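
   One possible shape for such a facility, as a sketch only: a small
   keyed list hanging off the pool.  The struct and function names
   here are assumptions for illustration and are not part of
   hackerlab.

```c
#include <stdlib.h>

/* Hypothetical attachment record: a unique key address plus the data.  */
struct attachment
{
  const void * key;
  void * data;
  struct attachment * next;
};

/* Stand-in for the part of an alloc_limits that holds attachments.  */
struct pool_limits
{
  struct attachment * attachments;
};

/* Attach DATA under KEY, replacing any previous value.
   Returns 0 on success, -1 if no memory is available.  */
int
pool_attach (struct pool_limits * limits, const void * key, void * data)
{
  struct attachment * a;

  for (a = limits->attachments; a; a = a->next)
    if (a->key == key)
      {
        a->data = data;
        return 0;
      }

  a = malloc (sizeof *a);
  if (!a)
    return -1;
  a->key = key;
  a->data = data;
  a->next = limits->attachments;
  limits->attachments = a;
  return 0;
}

/* Return the data attached under KEY, or 0 if there is none.  */
void *
pool_attached (struct pool_limits * limits, const void * key)
{
  struct attachment * a;

  for (a = limits->attachments; a; a = a->next)
    if (a->key == key)
      return a->data;
  return 0;
}
```

   Using the address of a static variable as the key lets independent
   modules (descriptor tracking, thread cwd, exit status) attach data
   without agreeing on a shared struct layout.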


g) Eliminate the safe_ i/o functions

   (The safe_ and vu_ functions are essentially the familiar file
   system interface: open, close, read, write, etc.)

   Find all the remaining safe_* calls in -cmds.   Replace
   each of these with the corresponding vu_ call, check the error
   return and print a message and exit upon error.


h) Write a thin wrapper for the hackerlab vu_ functions.

   Call the new wrapper pvu_.   It should work by adding a new
   first parameter to each function: an alloc_limits value.
   When a descriptor is allocated by a pvu_ function, and the
   alloc_limits parameter is not 0, lim_no_allocations_limits, 
   or lim_use_malloc_limits, then it should note the descriptor
   in data attached to the limits structure.

   When a descriptor is closed, it should remove it from the list
   of remembered descriptors.
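
   A minimal sketch of the bookkeeping, under some simplifying
   assumptions: struct fd_pool is a hypothetical stand-in for the data
   attached to an alloc_limits, a null pool plays the role of the
   "do not track" limits values, and the wrappers call POSIX open()
   and close() directly where the real ones would call the vu_
   functions.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#define POOL_MAX_FDS 64

/* Hypothetical per-pool record of descriptors opened through pvu_.  */
struct fd_pool
{
  int fds[POOL_MAX_FDS];
  int n_fds;
};

/* Open PATH and, if a pool is given, remember the descriptor in it.  */
int
pvu_open (struct fd_pool * limits, const char * path, int flags, int mode)
{
  int fd;

  if (limits && limits->n_fds == POOL_MAX_FDS)
    {
      errno = EMFILE;
      return -1;
    }

  fd = open (path, flags, mode);
  if (fd >= 0 && limits)
    limits->fds[limits->n_fds++] = fd;
  return fd;
}

/* Forget FD (if it was remembered) and close it.  */
int
pvu_close (struct fd_pool * limits, int fd)
{
  int i;

  if (limits)
    for (i = 0; i < limits->n_fds; ++i)
      if (limits->fds[i] == fd)
        {
          limits->fds[i] = limits->fds[--limits->n_fds];
          break;
        }
  return close (fd);
}
```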


i) Change all of the vu_ calls, throughout the library, to pvu_ calls.


j) Write pvu_threadsafe_chdir

   Roughly, this function should be:

     pvu_threadsafe_chdir (alloc_limits limits, char * path)
        here = pvu_open (limits, ".");
        pvu_chdir (limits, path);
        there = pvu_open (limits, ".");
        remember `there' as the thread_cwd in limits;
        pvu_fchdir (limits, here);
        pvu_close (limits, here);

   (It's actually more complicated since you have to permit it
   to be called more than once.)

   All of the pvu functions that take a path name should be
   modified.   For example, pvu_open becomes:

        if (limits has a thread_cwd value)
          here = pvu_open (limits, ".");
          pvu_fchdir (limits, there);

        answer = open();

        if (limits has a thread_cwd value)
          pvu_fchdir (limits, here);
          pvu_close (limits, here);
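
   The pseudocode above can be fleshed out roughly as follows.  This
   is a sketch under assumptions: struct cwd_limits is a hypothetical
   stand-in for the thread_cwd data kept in an alloc_limits, and plain
   POSIX calls replace the pvu_ ones.  It handles repeated calls by
   resolving the new path relative to the previous thread_cwd.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for the per-limits state.  thread_cwd is a
   descriptor open on the thread's directory, or -1 when unset.  */
struct cwd_limits
{
  int thread_cwd;
};

int
pvu_threadsafe_chdir (struct cwd_limits * limits, const char * path)
{
  int here;
  int there;
  int status = -1;

  here = open (".", O_RDONLY);
  if (here < 0)
    return -1;

  /* A repeated call resolves PATH relative to the old thread_cwd.  */
  if (limits->thread_cwd >= 0 && fchdir (limits->thread_cwd) < 0)
    goto restore;
  if (chdir (path) < 0)
    goto restore;
  there = open (".", O_RDONLY);
  if (there < 0)
    goto restore;

  if (limits->thread_cwd >= 0)
    close (limits->thread_cwd);
  limits->thread_cwd = there;
  status = 0;

 restore:
  /* Always put the process-wide cwd back where it was.  */
  if (fchdir (here) < 0)
    status = -1;
  close (here);
  return status;
}

int
pvu_open (struct cwd_limits * limits, const char * path, int flags, int mode)
{
  int here = -1;
  int answer;

  if (limits->thread_cwd >= 0)
    {
      here = open (".", O_RDONLY);
      if (here < 0)
        return -1;
      if (fchdir (limits->thread_cwd) < 0)
        {
          close (here);
          return -1;
        }
    }

  answer = open (path, flags, mode);

  if (here >= 0)
    {
      if (fchdir (here) < 0 && answer >= 0)
        {
          close (answer);
          answer = -1;
        }
      close (here);
    }
  return answer;
}
```

   Note that the process-wide cwd still changes briefly inside each
   call, so these functions presumably have to be serialized across
   threads, for example by the global mutex of step (p).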


k) write pvu_chdir_for_thread

   If the limits has a `there' directory, actually fchdir to 
   it and return.


l) write pvu_current_working_directory (see src/hackerlab/fs/cwd.[ch])

   Obviously, pvu_current_working_directory should return the
   directory name of the `there' directory if there is one.


m) insert pvu_chdir_for_thread calls in forked subprocesses

   Find all calls to `fork' in libarch.   In each subprocess,
   insert a call to pvu_chdir_for_thread


n) replace all libarch calls to pvu_chdir with pvu_threadsafe_chdir


o) review all calls to `wait'-family functions

   Make sure I did the right thing and that each call waits for a
   specific pid.
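
   For reference, the pattern to look for is sketched below with
   hypothetical function names.  Waiting on the exact pid matters in a
   library because a bare wait() could reap a child that belongs to
   some other thread or to the host application.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run FN in a forked child and return its exit status, or -1 if the
   fork or wait failed or the child did not exit normally.  */
int
run_child_and_wait (int (*fn) (void))
{
  pid_t pid;
  int status;

  pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    _exit (fn ());

  /* Wait for this child only, retrying if a signal interrupts us.  */
  while (waitpid (pid, &status, 0) < 0)
    if (errno != EINTR)
      return -1;

  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}

/* Trivial child body used to exercise the helper.  */
int
example_child_body (void)
{
  return 7;
}
```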


p) make libhackerlab threadsafe in the trivial way

   Add a global mutex to libhackerlab and make all of the functions it
   contains mutually exclusive.   (This can be refined later or in
   parallel with the other steps).


q) write pvu_close_everything

   This function should close all descriptors that are remembered
   within a given alloc-limits.


r) write make_pool_allocator, and pool_cleanup

   make_pool_allocator should return a new alloc_limits that will
   remember all allocated, not-yet-freed memory.

   pool_cleanup should call pvu_close_everything and also free
   all still-allocated memory.


s) write make_pseudo_process_allocator and friends

   make_pseudo_process_allocator should return an alloc_limits
   created by make_pool_allocator.

   The functions:

        pproc_set_status
        pproc_get_status

   should allow you to set and retrieve an integer (exit status)
   which is remembered in the alloc_limits.

   The functions:

       pproc_set_stderr_fd
       pproc_get_stderr_fd

   should allow you to remember an fd for error messages in 
   the alloc limits.


t) add an alloc_limits parameter to the `arch_cmd_*' functions.

   Before this step, the arch_cmd_* functions should be the last
   remaining libarch functions that pass 0 where an alloc_limits is
   needed.

   After this step, arch_call_cmd should be the last remaining
   function with that property.


u) add an alloc_limits parameter to arch_call_cmd

   and construct a pseudo_process_allocator in the front-end
   program, tla.c


v) split -cmds into -cmds and -ops

   Leave the files cmd.[ch] and cmd-*.[ch] in -cmds, and 
   move everything else to -ops


w) write pproc_exec_setjmp and pproc_exit_longjmp

   pproc_exec_setjmp should remember a jump buffer
   in an alloc_limits.

   pproc_exit_longjmp should call pproc_set_status and longjmp
   through that jump buffer.


x) write pproc_panic

   Similar to ordinary `panic' except that it should write to the
   pproc_get_stderr_fd and call pproc_exit_longjmp (if those are set)
   rather than writing to 2 and calling _exit().
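
   Steps (w) and (x) can be sketched as below.  struct pproc is a
   hypothetical stand-in for the pseudo-process data kept in an
   alloc_limits.  Because setjmp has to be called from a stack frame
   that is still live when longjmp runs, this sketch folds the role of
   pproc_exec_setjmp into a hypothetical pproc_run helper that calls
   the body itself.

```c
#include <setjmp.h>
#include <string.h>
#include <unistd.h>

struct pproc
{
  int status;
  int stderr_fd;      /* 0 or less means "not set" in this sketch */
  int have_jmp;
  jmp_buf exit_jmp;
};

void
pproc_set_status (struct pproc * p, int status)
{
  p->status = status;
}

/* Remember a jump buffer, run BODY, and return the pseudo-process
   exit status (0 if BODY returned normally).  */
int
pproc_run (struct pproc * p, void (*body) (struct pproc *))
{
  p->status = 0;
  p->have_jmp = 1;
  if (setjmp (p->exit_jmp) == 0)
    body (p);
  p->have_jmp = 0;
  return p->status;
}

/* "Exit" the pseudo-process: unwind to pproc_run if a jump buffer is
   set, otherwise really exit.  */
void
pproc_exit_longjmp (struct pproc * p, int status)
{
  pproc_set_status (p, status);
  if (p->have_jmp)
    longjmp (p->exit_jmp, 1);
  _exit (status);
}

static void
fd_puts (int fd, const char * s)
{
  ssize_t ignored = write (fd, s, strlen (s));
  (void) ignored;
}

/* Like panic, but aimed at the pseudo-process stderr and exit path.  */
void
pproc_panic (struct pproc * p, const char * msg)
{
  int fd = (p->stderr_fd > 0) ? p->stderr_fd : 2;

  fd_puts (fd, "PANIC: ");
  fd_puts (fd, msg);
  fd_puts (fd, "\n");
  pproc_exit_longjmp (p, 2);
}

/* Example body that fails part-way through.  */
void
example_failing_op (struct pproc * p)
{
  pproc_panic (p, "example failure");
}
```

   Anything allocated before the longjmp is not released by the jump
   itself, which is why the pool cleanup of step (r) matters here.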


y) replace all libarch calls to `panic' and `exit'

   with calls to pproc_panic and pproc_exit_longjmp


z) modify all direct stderr output

   Replace all output-to-stderr calls in libarch with calls that
   write to the pproc_get_stderr_fd


a1) begin to design the API, eliminate stdout output in -ops

   A crude approximation of the ideal API is that every subcommand
   should be an API function.    But that's obviously a bit _too_
   crude.   For example, `abrowse' simply does too much for a 
   reasonable API.   The `--summary' option (to print the summary line
   of various log messages) to various commands doesn't belong in the 
   API.

   Design an API by examining the list of subcommands, and reducing
   their functionality to a sane set of primitives.   (In many cases,
   the -ops directory will already have a function that directly 
   implements a reasonable choice of "primitive".)

   For each such primitive, examine the -cmds which should use it 
   and make sure that they do.

   None of the primitive operations should produce output directly.
   There are two cases:

        a) the primitive should obviously just return a value

           (As an example, the existing function
           arch_archive_revisions is the primitive underlying the
           command `tla revisions' and it already works by returning a
           value.)

        b) the primitive should obviously produce its "return values"
           incrementally

           (As an example, arch_apply_changeset is a long running
           command.  Part of its "return value" is the set of messages
           about the changes it's making:

                A file-one.c
                M file-two.c
                ....

           It's highly desirable that such return values are produced 
           incrementally.   In an interactive front-end, you want to 
           see these values as they are produced rather than have the 
           command sit silently for a long time and then display them
           all at once.)

   In the case of (b), make sure that the primitives work by invoking
   a callback for each piece of output, and, if sensible, separately
   construct
   a return value.    A paradigmatic example of this is the way that
   `struct arch_make_changeset_report' is used in make-changeset.[ch].

   After this step, the only output statements in -ops should be error
   messages to pproc_get_stderr_fd and ordinary output via callbacks.
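
   As a sketch of case (b), with hypothetical names throughout (this
   is not the actual arch_make_changeset_report interface): the
   primitive reports each change through a callback as it happens and
   separately fills in a summary that serves as its return value.

```c
#include <stddef.h>
#include <stdio.h>

/* Called once per change, as soon as that change has been made.  */
typedef void (*change_callback) (void * closure, char kind, const char * path);

struct change
{
  char kind;            /* 'A' for added, 'M' for modified */
  const char * path;
};

struct apply_report
{
  int n_added;
  int n_modified;
};

/* Hypothetical primitive: no direct output, only callbacks plus a
   report that is complete when the function returns.  */
void
apply_changes (const struct change * changes, size_t n_changes,
               change_callback callback, void * closure,
               struct apply_report * report)
{
  size_t i;

  report->n_added = 0;
  report->n_modified = 0;

  for (i = 0; i < n_changes; ++i)
    {
      /* ... the real work for this file would happen here ... */
      if (changes[i].kind == 'A')
        ++report->n_added;
      else if (changes[i].kind == 'M')
        ++report->n_modified;

      if (callback)
        callback (closure, changes[i].kind, changes[i].path);
    }
}

/* What a command in -cmds might pass: print each line immediately.  */
void
print_change (void * closure, char kind, const char * path)
{
  fprintf ((FILE *) closure, "%c %s\n", kind, path);
}

/* What another front end might pass: just count the notifications.  */
void
count_change (void * closure, char kind, const char * path)
{
  (void) kind;
  (void) path;
  ++*(int *) closure;
}
```

   The same primitive then serves a command-line tool (print_change
   with stdout as the closure) and a front end that wants to update
   its own display.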


b1) cleanup step

   Split -ops into -api and -apiutils.    There should be a strict 
   layering as follows:

        anything can call libfsutils and libawk, otherwise:

        libarch-cmds calls only itself, libarch-api and libarch-utils

        libarch-api calls only itself, libarch-apiutils, and libarch-utils
           
        libarch-apiutils calls only itself and libarch-utils

        libarch-utils calls only itself

        tla calls only libarch-cmds


c1) write -bindings

   The -bindings functions should directly mirror the
   -api functions except that they should not take an alloc_limits 
   value[*].

   Instead, on each call, they should allocate their own
   pseudo-process limits and perform a pproc_exec_setjmp

   Before returning, they should each copy the return value to 
   non-pool-allocated memory and cleanup the pproc pool.

   Although they don't take an alloc_limits argument, the -bindings
   functions should take an `int * poll' argument.   While *poll is 0
   the API function should run normally; once it is not 0, the binding
   function should return quickly.

   The -bindings functions should remember the `poll' parameter in the
   alloc_limits structures they create.

   [*] The -bindings functions _can_ take an alloc_limits which they
   use when copying return values, but still need to make a
   pproc_limits which is passed to the -api routines
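
   The life cycle of one binding call might be sketched like this.
   Everything named here is hypothetical: struct pproc stands in for
   the pseudo-process limits, api_tree_version for some -api function,
   and the version string is a placeholder.  The pproc struct is heap
   allocated so that nothing local to the setjmp frame changes between
   setjmp and a possible longjmp.

```c
#include <setjmp.h>
#include <stdlib.h>
#include <string.h>

#define PPROC_MAX_ALLOCS 32

struct pproc
{
  void * allocs[PPROC_MAX_ALLOCS];   /* pool-allocated memory */
  int n_allocs;
  int status;
  int * poll;                        /* remembered for arch_exit_poll */
  jmp_buf exit_jmp;
};

static void *
pproc_malloc (struct pproc * p, size_t amt)
{
  void * mem;

  if (p->n_allocs == PPROC_MAX_ALLOCS)
    return 0;
  mem = malloc (amt);
  if (mem)
    p->allocs[p->n_allocs++] = mem;
  return mem;
}

static void
pproc_cleanup (struct pproc * p)
{
  while (p->n_allocs > 0)
    free (p->allocs[--p->n_allocs]);
}

/* Hypothetical -api function: its result lives in the pool.  */
static char *
api_tree_version (struct pproc * p, const char * tree_root)
{
  const char * answer = "example--main--1.0";   /* placeholder result */
  char * copy;

  (void) tree_root;
  copy = pproc_malloc (p, strlen (answer) + 1);
  if (!copy)
    {
      p->status = 2;
      longjmp (p->exit_jmp, 1);
    }
  strcpy (copy, answer);
  return copy;
}

/* The binding: no alloc_limits parameter, but a poll pointer.  The
   caller owns the returned string and releases it with free().  */
char *
binding_tree_version (int * poll, const char * tree_root)
{
  struct pproc * p = calloc (1, sizeof *p);
  char * volatile result = 0;

  if (!p)
    return 0;
  p->poll = poll;

  if (setjmp (p->exit_jmp) == 0)
    {
      const char * pooled = api_tree_version (p, tree_root);
      char * copy = malloc (strlen (pooled) + 1);

      if (copy)
        strcpy (copy, pooled);
      result = copy;               /* non-pool memory survives cleanup */
    }

  pproc_cleanup (p);
  free (p);
  return result;
}
```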


d1) write and sprinkle arch_exit_poll

  The arch_exit_poll function should examine the `*poll' remembered in
  an alloc limits.   If it is not 0, it should print a message to 
  pproc_get_stderr_fd and call pproc_exit_longjmp.

  Calls to arch_exit_poll should be sprinkled in long-running parts of
  libarch-api and libarch-apiutils.
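
  A minimal sketch of the polling idea, again with a hypothetical
  struct pproc standing in for the alloc limits and a made-up
  long-running operation.  The caller (or another thread, or a signal
  handler) sets *poll to a non-zero value to ask for an early return.

```c
#include <setjmp.h>
#include <stdlib.h>
#include <unistd.h>

struct pproc
{
  int * poll;
  int status;
  int stderr_fd;      /* -1 when unset */
  jmp_buf exit_jmp;
};

/* If an interrupt has been requested, report it and unwind.  */
void
arch_exit_poll (struct pproc * p)
{
  static const char msg[] = "interrupted\n";

  if (p->poll && *p->poll)
    {
      if (p->stderr_fd >= 0)
        {
          ssize_t ignored = write (p->stderr_fd, msg, sizeof msg - 1);
          (void) ignored;
        }
      p->status = 1;
      longjmp (p->exit_jmp, 1);
    }
}

/* Hypothetical long-running -api function: polls once per item.  */
void
api_long_operation (struct pproc * p, int n_items, int * n_done)
{
  int i;

  for (i = 0; i < n_items; ++i)
    {
      arch_exit_poll (p);
      /* ... process item i ... */
      ++*n_done;
    }
}

/* Binding-style wrapper: returns 0 if the operation ran to the end,
   1 if it was interrupted, and -1 if setup failed.  */
int
run_interruptible (int * poll, int n_items, int * n_done)
{
  struct pproc * p = calloc (1, sizeof *p);
  int status;

  if (!p)
    return -1;
  p->poll = poll;
  p->stderr_fd = -1;

  if (setjmp (p->exit_jmp) == 0)
    api_long_operation (p, n_items, n_done);

  status = p->status;
  free (p);
  return status;
}
```

  How densely to sprinkle the calls is a trade-off: each one is only a
  pointer test, but the work between two polls bounds how long an
  interrupt can take to be noticed.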


e1) take a swig

   Enjoy -bindings!


The impact on the cost of making further changes to tla is not huge
under this approach and _mostly_ positive.  The resulting bindings
will be thread-safe and safely interruptible.

-t




